Abstract

This work concerns formal descriptions of DNA code properties, and builds on 
previous work on transducer descriptions of classic code properties and on trajectory 
descriptions of DNA code properties. This line of research allows us to give a property 
as input to an algorithm, in addition to any regular language, which can then answer 
questions about the language and the property. Here we define DNA code properties 
via transducers and show that this method is strictly more expressive than that of 
trajectories, without sacrificing the efficiency of deciding the satisfaction question. 
We also show that the maximality question can be undecidable. Our undecidability 
results hold not only for the fixed DNA involution but also for any fixed antimorphic 
permutation. Moreover, we also show the undecidability of the antimorphic version
of the Post Correspondence Problem, for any fixed antimorphic permutation.


1 Introduction 

The study of formal methods for describing independent language properties (widely known 
as code properties) provides tools that allow one to give a property as input to an algorithm 
and answer questions about this property. Examples of such properties include classic ones 
[4,17,27,28] like prefix codes, bifix codes, and various error-detecting languages, as well 
as DNA code properties [2,10,11,13-15,18-21,25] like $\theta$-nonoverlapping and $\theta$-compliant
languages. A formal description method should be expressive enough to allow one to
describe many desirable properties. Examples of formal methods for describing classic code 
properties are the implicational conditions method of [16], the trajectories method of [5], 
and the transducer methods of [8]. The latter two have been implemented to some extent 
in the Python package FAdo [9]. A formal method for describing DNA code properties is 
the method of trajectory DNA code properties [6,21]. 

Typical questions about properties are the following:

Satisfaction problem: given the description of a property and the description of a regular
language, decide whether the language satisfies the property.

Maximality problem: given the description of a property and the description of a regular 
language that satisfies the property, decide whether the language is maximal with 
respect to the given property. 

Construction problem: given the description of a property and a positive integer n, find 
a language of n words (if possible) satisfying the given property. 

In the above problems regular languages are described via (non-deterministic) finite 
automata (NFA). Depending on the context, properties are described via trajectory regular 
expressions or transducer expressions. The satisfaction problem is the most basic one and
can usually be answered efficiently, in polynomial time. The maximality problem as stated
above can be decidable, in which case it is normally PSPACE-hard. For existing transducer
properties, both problems can be answered using the online (formal) language server LaSer 
[24], which relies on FAdo. LaSer allows users to enter the desired property and language, 
and returns either the answer in real time (online mode), or it returns a Python program 
that computes the desired answer if executed at the user’s site (program generation mode). 
For the construction problem a simple statistical algorithm is included in FAdo, but we 
think that this problem is far from being well-understood. 

The general objective of this research is to develop methods for formally describing DNA 
code properties that would allow one to express various combinations of such properties 
and be able to get answers to questions about these properties. While the satisfaction and 
construction questions are important from both the theoretical and practical viewpoints, 
the maximality question is at least of theoretical interest and a classic problem in the 
theory of codes. The contributions of this work are as follows: 

1. The definition of a new simple formal method for describing many DNA code properties,
called $\theta$-transducer properties, some of which cannot be described by the
existing transducer and trajectory methods for classic code properties; see Sect. 3.
These methods are closed under intersection of code properties. This means that if 
two properties can be described within the method then also the combined property 
can be described within the method. This outcome is important as in practice it is 
desirable that languages satisfy more than one property. 

2. The demonstration that the new method of transducer DNA code properties is properly
more expressive than the method of trajectories; see Sect. 4.

3. The demonstration that the maximality problem can be decidable for some transducer
DNA code properties but undecidable for some others; see Sect. 5.

4. The demonstration that some classic undecidable problems (like PCP) remain undecidable
when rephrased in terms of any fixed (anti-)morphic permutation $\theta$ of the
alphabet, with the case $\theta = \mathrm{id}$ corresponding to these classic problems, where id is
the (morphic) identity; see Sect. 6. This contribution is mathematically relevant to
the undecidability of the maximality problem for DNA-related properties, so it is
natural to include it with the above contributions in one publication.

5. The presentation of a natural hierarchy of DNA properties which are all $\theta$-transducer
properties; see Sect. 7. This hierarchy generalizes the concept of bond-free properties
in [13,18,19].

Even though our main motivation is the description of DNA-related properties, we
follow the more general approach which considers properties described by transducers
involving a fixed (anti-)morphic permutation $\theta$; again, the classical transducer properties
are obtained by letting $\theta = \mathrm{id}$. In the setting of DNA properties, we consider the alphabet
$\Delta = \{\mathtt{A}, \mathtt{C}, \mathtt{G}, \mathtt{T}\}$ and $\theta = \delta$ being the involution (i.e., antimorphic permutation with
$\delta^2 = \mathrm{id}$) given by $\delta(\mathtt{A}) = \mathtt{T}$, $\delta(\mathtt{T}) = \mathtt{A}$, $\delta(\mathtt{C}) = \mathtt{G}$, and $\delta(\mathtt{G}) = \mathtt{C}$. As it turns out, in the case
when $\theta$ is morphic all questions that we consider in this paper can be answered analogously
to the solutions for the classical case where $\theta = \mathrm{id}$. Therefore, we focus on the transducer
properties involving antimorphic permutations in this paper.


2 Basic Notions and Background Information 

In this section we lay down our notation for formal languages, (anti-)morphic permutations,
transducers, and language properties. We assume the reader to be familiar with the
fundamental concepts of language theory; see e.g., [12,26]. Then, in Sect. 2.2 we recall the
method of transducers for describing classic code properties, and in Sect. 2.3 we recall the 
method of trajectories for describing DNA-related properties. 

2.1 Formal Languages and (Anti-)morphic Permutations 

An alphabet $A$ is a finite set of letters; $A^*$ is the set of all words or strings over $A$; $\varepsilon$ denotes
the empty word; and $A^+ = A^* \setminus \{\varepsilon\}$. A language $L$ over $A$ is a subset $L \subseteq A^*$; the
complement $L^c$ of $L$ is the language $A^* \setminus L$. For an integer $m \in \mathbb{N}$ we let $A^{\le m}$ denote the
set of words whose length is at most $m$; i.e., $A^{\le m} = \bigcup_{i \le m} A^i$. The DNA alphabet is $\Delta =
\{\mathtt{A}, \mathtt{C}, \mathtt{G}, \mathtt{T}\}$. Often it is convenient to consider the generic alphabet $A_k = \{0, 1, \ldots, k-1\}$
of size $k$ rather than a general alphabet; note that $A_2 \subset A_3 \subset A_4 \subset \cdots$. Throughout
this paper we only consider alphabets with at least two letters because our investigations
would become trivial over unary alphabets.

Let $w \in A^*$ be a word. Unless confusion arises, by $w$ we also denote the singleton
language $\{w\}$, e.g., $L \cup w$ means $L \cup \{w\}$. If $w = xyz$ for some $x, y, z \in A^*$, then $x$, $y$, and
$z$ are called prefix, infix (or factor), and suffix of $w$, respectively. For a language $L \subseteq A^*$,
the set $\mathrm{Pref}(L) = \{x \in A^* \mid \exists y \in A^* : xy \in L\}$ denotes the language containing all prefixes
of words in $L$. If $w = a_1 a_2 \cdots a_n$ for letters $a_1, a_2, \ldots, a_n \in A$, then $|w| = n$ is the length
of $w$; for $b \in A$, $|w|_b = |\{i \mid a_i = b, 1 \le i \le n\}|$ is the tally of $b$ occurring in $w$; the $i$-th
letter of $w$ is $w_{[i]} = a_i$ for $1 \le i \le n$; the infix of $w$ from the $i$-th letter to the $j$-th letter is
$w_{[i;j]} = a_i a_{i+1} \cdots a_j$ for $1 \le i \le j \le n$; and the reverse of $w$ is $w^R = a_n a_{n-1} \cdots a_1$.

Consider a generic alphabet $A_k$ with $k \ge 2$. The identity function on $A_k$ is denoted by
$\mathrm{id}_k$; when the alphabet is clear from the context, the index $k$ is omitted. For a permutation
(or bijection) $\theta\colon A_k \to A_k$, the permutation $\theta^{-1}$ is the inverse of $\theta$ as usual; i.e., $\theta \circ \theta^{-1} = \mathrm{id}_k$
("$\circ$" is the composition of two functions, $(g \circ h)(x) = g(h(x))$ for all $x$). For $i \in \mathbb{Z}$, the
permutation $\theta^i$ is the $i$-fold composition of $\theta$; i.e., $\theta^0 = \mathrm{id}_k$, $\theta^i = \theta \circ \theta^{i-1}$, and $\theta^{-i} = (\theta^i)^{-1} =
(\theta^{-1})^i$ for $i > 0$. There exists a smallest positive integer $n$, called the order of $\theta$, such that $\theta^n = \mathrm{id}_k$. An
involution $\theta$ is a permutation of order 2; i.e., $\theta = \theta^{-1}$.

A permutation $\theta$ over $A_k$ can naturally be extended to operate on words in $A_k^*$ as (a) morphic
permutation $\theta(uv) = \theta(u)\theta(v)$, or (b) antimorphic permutation $\theta(uv) = \theta(v)\theta(u)$, for
$u, v \in A_k^*$. As before, the inverse $\theta^{-1}$ of the (anti-)morphic permutation $\theta$ over $A_k^*$ is the
(anti-)morphic extension of the permutation $\theta^{-1}$ over $A_k$. Note that the composition of
two antimorphic or two morphic permutations yields a morphic permutation, whereas the
composition of a morphic and an antimorphic permutation yields an antimorphic permutation.
Therefore, if $\theta$ is an antimorphic permutation, then $\theta^i$ is morphic if and only if $i$ is
even. The identity $\mathrm{id}_k$ always denotes the morphic extension of $\mathrm{id}_k$, while the antimorphic
extension of $\mathrm{id}_k$, called the mirror image or reverse, is usually denoted by the exponent $R$.

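Example 1. The DNA involution, denoted as $\delta$, is an antimorphic involution on $\Delta =
\{\mathtt{A}, \mathtt{C}, \mathtt{G}, \mathtt{T}\}$ such that $\delta(\mathtt{A}) = \mathtt{T}$ and $\delta(\mathtt{C}) = \mathtt{G}$, which implies $\delta(\mathtt{T}) = \mathtt{A}$ and $\delta(\mathtt{G}) = \mathtt{C}$.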
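
As an illustration of Example 1, the following minimal Python sketch implements the DNA involution on words as the reverse complement and checks the antimorphic and involution identities on sample words. The names COMPLEMENT and delta are chosen here for illustration only and do not come from FAdo or any other library.

```python
# Letter-level permutation of the DNA involution (illustrative name).
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def delta(word: str) -> str:
    """Antimorphic extension: complement every letter and reverse the order."""
    return "".join(COMPLEMENT[letter] for letter in reversed(word))

u, v = "ACC", "GT"
# Antimorphic identity: delta(uv) = delta(v) delta(u).
print(delta(u + v) == delta(v) + delta(u))
# Involution identity: applying delta twice gives the original word.
print(delta(delta(u)) == u)
```

Because the extension is antimorphic, a word such as AT can be its own image, which is the kind of word that matters for the nonoverlapping property discussed later.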

A language operator is any mapping $\mathrm{Op}\colon 2^{A^*} \to 2^{A^*}$. The prefix function Pref defined
earlier is an example of a language operator. A transducer (see Sect. 2.2) can be viewed
as a language operator. Any (anti-)morphic permutation, as well as any other function,
$h\colon A^* \to A^*$ over words is extended to a language operator such that for $L \subseteq A^*$

$$h(L) = \bigcup_{x \in L} \{h(x)\}.$$

If $\mathrm{Op}_1$ and $\mathrm{Op}_2$ are language operators, then $(\mathrm{Op}_1 \vee \mathrm{Op}_2)$ is the language operator such
that $(\mathrm{Op}_1 \vee \mathrm{Op}_2)(X) = \mathrm{Op}_1(X) \cup \mathrm{Op}_2(X)$, for all languages $X$.

2.2 Describing Classic Code Properties by Transducers 

A (language) property $\mathcal{P}$ is any set of languages. A language $L$ satisfies $\mathcal{P}$, or has $\mathcal{P}$, if
$L \in \mathcal{P}$. Here by a property $\mathcal{P}$ we mean an ($n$-)independence in the sense of [17]: there
exists $n \in \mathbb{N} \cup \{\aleph_0\}$ such that a language $L$ satisfies $\mathcal{P}$ if and only if all nonempty subsets
$L' \subseteq L$ of cardinality less than $n$ satisfy $\mathcal{P}$. A language $L$ satisfying $\mathcal{P}$ is maximal (with
respect to $\mathcal{P}$) if for every word $w \in L^c$ the language $L \cup w$ does not satisfy $\mathcal{P}$. Note that, for
any independence $\mathcal{P}$, every language in $\mathcal{P}$ is a subset of a maximal language in $\mathcal{P}$ [17]. To
our knowledge all code-related properties in the literature, including DNA code properties,
are independence properties. As we shall see further below, the focus of this work is on
3-independence properties that can also be viewed as independent with respect to a binary
relation in the sense of [28].

A transducer $t$ is a non-deterministic finite state automaton with output; see e.g., [3,30].
In general, a transducer can have an output alphabet $B$ which is different from its input
alphabet $A$, thus defining a relation over $A^* \times B^*$. In this paper, however, we only consider
transducers where the input alphabet coincides with the output alphabet, $A = B$, which
leads to the following simplified definition: a transducer is a quintuple $t = (Q, A, E, I, F)$,
where $A$ is the input and output alphabet, $Q$ is a finite set of states, $E$ is a set of directed
edges between states from $Q$ which are labeled by word pairs $(u, v) \in A^* \times A^*$, $I$ is a set
of initial states, and $F$ a set of final states. For an edge label $(u, v)$ the word $u$ is called
input, while the word $v$ is called output. The transducer $t$ realizes the set of all pairs
$(x, y) \in A^* \times A^*$ such that $x$ is formed by concatenating the inputs, and $y$ is formed by
concatenating the outputs of the labels in a path of $t$ from the initial to the final states.
If $t$ realizes $(x, y)$ then we write $y \in t(x)$. We say that the set $t(x)$ contains all possible
outputs of $t$ on input $x$. It is well known that for two regular languages $R_1, R_2$ there exists
a transducer $t$ that realizes the relation $R_1 \times R_2$; i.e., $t$ realizes $(x, y)$ if and only if $x \in R_1$
and $y \in R_2$. The transducer $t^{-1}$ is the inverse of $t$; that is, $x \in t^{-1}(y)$ if and only if
$y \in t(x)$ for all words $x, y$. Note that $t^{-1}$ is obtained from $t$ by simply swapping the input
with the output word on each edge in $t$. For a language $L$ we naturally extend our notation
such that

$$t(L) = \bigcup_{x \in L} t(x).$$

Thus, a transducer can be viewed as a language operator.

Let $\theta$ be an (anti-)morphic permutation and $t$ be a transducer which are both defined
over the same alphabet $A$. The transducer $t$ is called $\theta$-input-preserving if for all $w \in A^+$
we have $\theta(w) \in t(w)$; $t$ is called $\theta$-input-altering if for all $w \in A^+$ we have $\theta(w) \notin t(w)$.
We use the simpler terms input-preserving and input-altering $t$, respectively, when $\theta = \mathrm{id}$.
Note that $\theta(w) \in t(w)$ is equivalent to $w \in \theta^{-1}(t(w))$ as well as $w \in t^{-1}(\theta(w))$.

Definition 2 ([8]). An input-altering transducer $t$ describes the property that consists of
all languages $L$ such that

$$t(L) \cap L = \emptyset. \quad (1)$$

An input-preserving transducer $t$ describes the property that consists of all languages $L$
such that

$$w \notin t(L \setminus w), \text{ for all } w \in L. \quad (2)$$

A property is called an input-altering (resp. input-preserving) transducer property, if it is 
described by an input-altering (resp. input-preserving) transducer. 

Note that every input-altering transducer property is also an input-preserving transducer
property. Input-altering transducers can be used to describe properties like prefix
codes, bifix codes, and hypercodes. Input-preserving transducers are intended for
error-detecting properties, where in fact the transducer plays the role of the communication
channel. Figure 1 shows a couple of examples.
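
Condition (1) can be tried out on finite languages with a small simulation. The sketch below is not FAdo code: it assumes a simplified transducer whose edges are tuples (source, input letter, output word, target), so every edge reads exactly one input letter, and it builds a two-state transducer that outputs the proper prefixes of its input. The edge layout is an assumption made for illustration and need not coincide with the transducer drawn in Figure 1.

```python
def transducer_outputs(edges, initial, final, x):
    """Return t(x) for edges of the form (p, input_letter, output_word, q)."""
    # Configurations reached after reading a prefix of x: (state, output so far).
    configs = {(q, "") for q in initial}
    for letter in x:
        configs = {(q, out + o)
                   for (state, out) in configs
                   for (p, a, o, q) in edges
                   if p == state and a == letter}
    return {out for (state, out) in configs if state in final}

def prefix_transducer(alphabet):
    """Hypothetical two-state transducer: copy a prefix, then delete >= 1 letter."""
    edges = []
    for a in alphabet:
        edges.append((0, a, a, 0))   # copy the letter
        edges.append((0, a, "", 1))  # first deleted letter
        edges.append((1, a, "", 1))  # further deleted letters
    return edges, {0}, {1}

def satisfies_input_altering(t, language):
    """Condition (1): t(L) and L are disjoint."""
    image = set().union(*(t(w) for w in language))
    return image.isdisjoint(language)

EDGES, INITIAL, FINAL = prefix_transducer("ab")

def t(word):
    return transducer_outputs(EDGES, INITIAL, FINAL, word)

print(sorted(t("aba")))
print(satisfies_input_altering(t, {"ab", "ba"}))
print(satisfies_input_altering(t, {"a", "ab"}))
```

For regular (possibly infinite) languages the same test would be carried out on automata rather than by enumeration; the enumeration here only serves to make the condition concrete.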

Many input-altering transducer properties can be described in a simpler manner by
trajectory regular expressions [5,8], that is, regular expressions over $\{0, 1\}$. For example,
the expression $0^*1^*$ describes prefix codes and the expression $1^*0^*1^*$ describes infix codes.
On the other hand, there are natural transducer properties that cannot be described by
trajectory expressions [8].

Figure 1: The left transducer is input-altering and describes the prefix codes: on input $x$
it outputs any proper prefix of $x$. The right transducer is input-preserving and describes
the 1-substitution error-detecting languages: on input $x$ it outputs either $x$ or any word
differing from $x$ in exactly one position. Note: in this and the following transducer figures,
an arrow with label $(a, a)$ represents a set of edges with labels $(a, a)$ for all $a \in A$; and
similarly for an arrow with label $(a, \varepsilon)$. An arrow with label $(a, b)$ represents a set of edges
with labels $(a, b)$ for all $a, b \in A$ with $a \ne b$.

2.3 Describing DNA-related Properties by Trajectories 

In [2,10,11,13-15,18-21,25] the authors consider numerous properties of languages inspired 
by reliability issues in DNA computing. We state three of these properties below. In Sect. 7 
we present a hierarchy of DNA properties which generalizes some of the DNA properties 
presented in [13,18,19]. Let $\theta$ be an antimorphic permutation over $A_k^*$. Recall that in the
DNA setting $\theta = \delta$ is an involution, and therefore, we have $\theta^2 = \mathrm{id}$.

(A) A language $L$ is $\theta$-nonoverlapping if $L \cap \theta(L) = \emptyset$.

(B) $L$ is $\theta$-compliant if $\forall w \in \theta(L),\ x, y \in A_k^* : xwy \in L \Rightarrow xy = \varepsilon$.

(C) $L$ is strictly $\theta$-compliant if it is $\theta$-nonoverlapping and $\theta$-compliant.
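
For finite languages, properties (A) to (C) can be checked directly from their definitions. The following Python sketch does so for the DNA involution; the function names are illustrative assumptions, and the brute-force infix test is only meant for small word sets.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def delta(word):
    """DNA involution: reverse complement of a word."""
    return "".join(COMPLEMENT[c] for c in reversed(word))

def is_nonoverlapping(language, theta=delta):
    """(A): L and theta(L) are disjoint."""
    return language.isdisjoint({theta(w) for w in language})

def is_compliant(language, theta=delta):
    """(B): no theta(w), w in L, is a proper infix of a word of L."""
    images = {theta(w) for w in language}
    # A violation is a word x.img.y in L with xy nonempty, i.e. img is an
    # infix of the word without being the whole word.
    return not any(img in word and img != word
                   for img in images for word in language)

def is_strictly_compliant(language, theta=delta):
    """(C): both (A) and (B) hold."""
    return is_nonoverlapping(language, theta) and is_compliant(language, theta)

print(is_nonoverlapping({"AT"}), is_compliant({"AT"}))
print(is_nonoverlapping({"ACC", "GGTA"}), is_compliant({"ACC", "GGTA"}))
```

The first language shows that compliance does not imply nonoverlapping, and the second shows the converse failure, since the image of ACC is GGT, a proper prefix of GGTA.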

Many of the existing DNA-related properties can be modelled using the concept of a
bond-free property, first defined in [21] and later rephrased in [6] in terms of trajectories.
We follow the formulation in [6]. Let $e = (e_1, e_2)$, where $e_1$ and $e_2$ are two regular trajectory
expressions. First, we define the following language operators.

$$\Phi_e(L) = (((L \leadsto_{e_1} A^+) \cap A^+) \sqcup\!\sqcup_{e_2} A^*) \cup (((L \leadsto_{e_1} A^*) \cap A^+) \sqcup\!\sqcup_{e_2} A^+). \quad (3)$$

$$\Phi^s_e(L) = ((L \leadsto_{e_1} A^*) \cap A^+) \sqcup\!\sqcup_{e_2} A^*. \quad (4)$$

The word operations $\sqcup\!\sqcup_t$ and $\leadsto_t$ are called shuffle (or scattered insertion) and scattered
deletion, respectively, over the trajectory $t$. A trajectory is any word over $\{0, 1\}$. For any
words $x, w$ and trajectory $t$ with $|t|_0 = |x|$ and $|t|_1 = |w|$, $x \sqcup\!\sqcup_t w$ is the set $\{y\}$ such that
the word $y$ is of length $|t|$ and results by the following process which scans the symbols of
$x$ left to right and also of $w$ left to right. For each index $i = 0, \ldots, |t| - 1$, $y[i]$ is the next
symbol of $x$ if $t[i] = 0$, or the next symbol of $w$ if $t[i] = 1$. If '$|t|_0 = |x|$ and $|t|_1 = |w|$'
is not satisfied then $x \sqcup\!\sqcup_t w = \emptyset$. For example, $1122 \sqcup\!\sqcup_{001010} 34 = 113242$. The reader is
referred to [6,22] for more details. For any languages $X$, $W$ and trajectory expression $a$,
we have that

$$X \sqcup\!\sqcup_a W = \bigcup_{x \in X,\, w \in W,\, t \in L(a)} x \sqcup\!\sqcup_t w.$$

For any words $x, w$ and trajectory $t$ with $|t| = |x|$ and $|t|_1 = |w|$, $x \leadsto_t w$ is either the
set $\{y\}$ such that the word $y$ is of length $|t|_0 = |x| - |w|$ and satisfies $\{x\} = y \sqcup\!\sqcup_t w$, or
the empty set otherwise. For example, $113242 \leadsto_{001010} 34 = 1122$. The reader is referred
to [6,22] again for more details. For any languages $X$, $W$ and trajectory expression $a$, we
have that

$$X \leadsto_a W = \bigcup_{x \in X,\, w \in W,\, t \in L(a)} x \leadsto_t w.$$
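
The two word operations can be made concrete with a short Python sketch. It follows the definitions above for a single trajectory given as a string over 0 and 1; the function names shuffle and delete are illustrative assumptions, and each function returns a set that is either empty or a singleton, as in the text.

```python
def shuffle(x, w, t):
    """Shuffle of x and w over trajectory t: 0 takes from x, 1 takes from w."""
    if t.count("0") != len(x) or t.count("1") != len(w):
        return set()
    xs, ws = iter(x), iter(w)
    return {"".join(next(xs) if bit == "0" else next(ws) for bit in t)}

def delete(x, w, t):
    """Scattered deletion: remove from x the letters at the 1-positions of t,
    provided that the removed letters spell w."""
    if len(t) != len(x) or t.count("1") != len(w):
        return set()
    removed = "".join(c for c, bit in zip(x, t) if bit == "1")
    kept = "".join(c for c, bit in zip(x, t) if bit == "0")
    return {kept} if removed == w else set()

# The two examples from the text.
print(shuffle("1122", "34", "001010"))
print(delete("113242", "34", "001010"))
```

The second call undoes the first, reflecting that deletion over a trajectory is the inverse of insertion over the same trajectory.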


Definition 3 ([6]). Let $\theta$ be an involution and $e_1, e_2$ be two regular trajectory expressions.
The bond-free property described by $(e_1, e_2)$ is

$$\mathcal{B}(e_1, e_2) = \{L \subseteq A^* \mid \theta(L) \cap \Phi_e(L) = \emptyset\}. \quad (5)$$

The strictly bond-free property described by $(e_1, e_2)$ is

$$\mathcal{B}^s(e_1, e_2) = \{L \subseteq A^* \mid \theta(L) \cap \Phi^s_e(L) = \emptyset\}. \quad (6)$$

A regular $\theta$-trajectory property is a bond-free property described by $(e_1, e_2)$, or a strictly
bond-free property described by $(e_1, e_2)$, for some pair $(e_1, e_2)$.

Example 4. The $\theta$-compliant property is the regular $\theta$-trajectory property $\mathcal{B}(1^*0^+1^*, 0^+)$:
deleting $x$ and $y$ in any $xwy$ (according to $1^*0^+1^*$), where at least one symbol gets deleted,
and then inserting nothing (according to $0^+$) cannot result in a word in $\theta(L)$. The
$\theta$-nonoverlapping property is the regular $\theta$-trajectory property $\mathcal{B}^s(0^+, 0^+)$: deleting nothing
and then inserting nothing in any word $w$ cannot result in a word in $\theta(L)$. The strictly
$\theta$-compliant property is the regular $\theta$-trajectory property $\mathcal{B}^s(1^*0^+1^*, 0^+)$: deleting $x$ and
$y$ in any $xwy$ (according to $1^*0^+1^*$) and inserting nothing (according to $0^+$) cannot result
in a word in $\theta(L)$.

We note that the actual definitions of bond-free properties in [6] are given in terms
of a pair $(T_1, T_2)$ of arbitrary sets of trajectories. However, here we only consider sets of
trajectories that can be represented by regular expressions. Moreover, the second statement
of Theorem 12, in Sect. 4, remains true if one uses $(T_1, T_2)$ instead of $(e_1, e_2)$, as the proof
makes no use of the fact that the trajectory sets involved are regular.

3 New Transducer-based DNA-related Properties

A question that arises from the discussion in Sections 2.2 and 2.3 is whether existing
transducer-based properties include DNA-related properties. It turns out that this is not
the case: for instance the $\theta$-nonoverlapping property, which seems to be the simplest
DNA-related property, cannot be described by any input-preserving transducer; see
Proposition 8. In this section, we define new transducer-based properties that are appropriate for
DNA-related applications, we demonstrate Proposition 8, and discuss how existing
DNA-related properties can be described with transducers. Then, in Sect. 4 we examine the
relationship between the new transducer properties and the regular $\theta$-trajectory properties
which were proposed in [6].

Definition 5. A transducer $t$ and an (anti-)morphic permutation $\theta$, defined over the same
alphabet, describe 3-independent properties in two ways:

1.) strict $\theta$-transducer property (S-property): $L$ satisfies the property $\mathcal{S}_{\theta,t}$ if

$$\theta(L) \cap t(L) = \emptyset \quad (7)$$

2.) weak $\theta$-transducer property (W-property): $L$ satisfies the property $\mathcal{W}_{\theta,t}$ if

$$\forall w \in L : \theta(w) \notin t(L \setminus w) \quad (8)$$

Any of the properties $\mathcal{S}_{\theta,t}$ or $\mathcal{W}_{\theta,t}$ is called a $\theta$-transducer property.

The difference between S-properties and W-properties is that $\mathcal{S}_{\theta,t}$ forbids that $L \in \mathcal{S}_{\theta,t}$
contains a word $w$ such that $\theta(w) \in t(w)$, while this case is allowed for $L \in \mathcal{W}_{\theta,t}$. For
fixed $t$, $\theta$, and $L$, Condition (7) implies that for all $w \in L$ we have $\theta(w) \cap t(L \setminus w) = \emptyset$,
which is equivalent to Condition (8). In other words, if $L$ satisfies $\mathcal{S}_{\theta,t}$, then $L$ satisfies
$\mathcal{W}_{\theta,t}$ as well. If $\theta = \mathrm{id}$ and $t$ is input-altering, or input-preserving, then the above defined
properties specialize to the existing ones stated in Definition 2.
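
Conditions (7) and (8) can be compared on finite languages with the following Python sketch. The transducer is replaced by a plain function from a word to its set of outputs; the stand-in used here, which returns all nonempty infixes of its input, is an assumption made for illustration, as are all function names.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def delta(word):
    """DNA involution: reverse complement of a word."""
    return "".join(COMPLEMENT[c] for c in reversed(word))

def nonempty_infixes(word):
    """Stand-in for a transducer t: every nonempty infix of the input."""
    return {word[i:j] for i in range(len(word))
            for j in range(i + 1, len(word) + 1)}

def satisfies_S(language, t, theta):
    """Condition (7): theta(L) and t(L) are disjoint."""
    image = set().union(*(t(w) for w in language))
    return image.isdisjoint({theta(w) for w in language})

def satisfies_W(language, t, theta):
    """Condition (8): theta(w) is not in t(L minus {w}), for every w in L."""
    return all(theta(w) not in t(v)
               for w in language for v in language if v != w)

for L in ({"AT"}, {"ACC", "GGTA"}, {"AAC", "ACC"}):
    print(sorted(L), satisfies_S(L, nonempty_infixes, delta),
          satisfies_W(L, nonempty_infixes, delta))
```

The singleton {AT} fails the S-property but passes the W-property, because Condition (8) never compares a word with its own image.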

Example 6. Consider the transducers in Fig. 2. For any word $xwy$, the left transducer
$t_s$, say, can delete $x$, then keep $w$ (which has to be non-empty), and then delete $y$. Thus,
$t_s(L) \cap \theta(L) = \emptyset$ if and only if $L$ is strictly $\theta$-compliant. Now let $xwy$ be a word with $xy \ne \varepsilon$ and
$w \ne \varepsilon$. If $y$ is nonempty, the right transducer $t$ can delete $x$, then keep $w$, and then delete
$y$ using the upper path (containing state 1); and if $x$ is nonempty, $t$ can delete $x$, then
keep $w$, and then delete $y$ using the lower path (containing state 2). Thus, $t(L) \cap \theta(L) = \emptyset$
if and only if $L$ is $\theta$-compliant. Using the FAdo [9] format, the left transducer can be specified
by the following string, assuming the alphabet $\{a, b\}$:

@Transducer 2 * 0\n0 a @epsilon 0\n0 b @epsilon 0\n0 a a 1\n
0 b b 1\n1 a a 1\n1 b b 1\n1 @epsilon @epsilon 2\n2 a @epsilon 2\n
2 b @epsilon 2\n



Figure 2: Together with $\theta$, the left transducer describes the strictly $\theta$-compliant property
and the right one describes the $\theta$-compliant property. See Example 6 for explanations.


As in the classic case where $\theta = \mathrm{id}$, also in the general case we have that $\theta$-input-altering
transducers play an important role for S-properties because only then the maximality
question is decidable. We did not fully explore the usefulness of $\theta$-input-preserving transducers for
antimorphic permutations yet. For morphic $\theta$, however, every transducer $t$ can be modified
to obtain a $\theta$-input-preserving transducer $t'$ such that $\mathcal{W}_{\theta,t} = \mathcal{W}_{\theta,t'}$; this concept can be
utilized in order to efficiently decide the satisfaction problem; see Sect. 5.

Remark 7. Note that only $\mathcal{S}_{\theta,t}$ for a transducer $t$ which is not $\theta$-input-altering can exclude
specific words from all languages which satisfy the property $\mathcal{S}_{\theta,t}$. Otherwise, when $t$ is
$\theta$-input-altering, it must not realize $(w, \theta(w))$; and when we consider a W-property, then
$\theta(w) \in t(w)$ is allowed for $w \in L$. In particular, every singleton language $L = \{w\}$ satisfies
all properties $\mathcal{W}_{\theta,t}$, as well as $\mathcal{S}_{\theta,t}$ if $t$ is $\theta$-input-altering.

As input-altering transducer properties are a subset of input-preserving transducer 
properties, we only consider the case of input-preserving transducer properties in the next 
two results. 

The next result demonstrates that existing transducer properties are not suitable for 
describing even simple DNA-related properties. 

Proposition 8. The $\delta$-nonoverlapping property is not describable by any input-preserving
transducer.

Proof. The singleton language $L = \{\mathtt{AT}\} \subseteq \Delta^*$ is not $\delta$-nonoverlapping, because the word
$\mathtt{AT} = \delta(\mathtt{AT})$ is a $\delta$-palindrome. Analogously to Remark 7, a transducer property $\mathcal{P}_t = \mathcal{W}_{\mathrm{id},t}$,
which is described by some input-preserving transducer $t$, cannot exclude any singleton
language. Therefore, we must have $L \in \mathcal{P}_t$. □

The counterexample language $\{\mathtt{AT}\}$ used to prove the previous result is rather artificial,
as in practice code-related languages should have more than two elements. However, the
statement remains true even if we focus on languages containing more than one word. This
case is handled in the next proposition.

Proposition 9. There is no input-preserving transducer $t$ that satisfies Equation (2) for
all $\delta$-nonoverlapping languages $L$ having at least two elements.

Proof. Assume the contrary, that is, there is an input-preserving transducer $t$ such that
for any DNA language $L \subseteq \Delta^*$ with at least two elements we have

$$\delta(L) \cap L = \emptyset \text{ iff } \forall u \in L : t(u) \cap (L \setminus u) = \emptyset.$$

We can assume that $t$ is in normal form, that is, the label of every edge is of the form
$(a, \varepsilon)$ or $(\varepsilon, a)$, for some $a \in \Delta$. Assume that $t$ has $n$ states, for some positive integer $n$,
and let $m > n$. We have that $\{\mathtt{A}^m\mathtt{C}^m, \mathtt{G}^m\mathtt{T}^m\}$ is not $\delta$-nonoverlapping, so without loss of
generality we have that $\mathtt{G}^m\mathtt{T}^m \in t(\mathtt{A}^m\mathtt{C}^m)$. Consider an accepting path $\pi$ of $t$ whose label
is $(\mathtt{A}^m\mathtt{C}^m, \mathtt{G}^m\mathtt{T}^m)$ and say $\pi$ consists of $N$ consecutive edges, for some positive integer $N$.
Then, these edges are $s_{i-1} \xrightarrow{(x_i, y_i)} s_i$, for $i = 1, \ldots, N$, so that the concatenation of the
$x_i$'s is equal to $\mathtt{A}^m\mathtt{C}^m$ and the concatenation of the $y_i$'s is equal to $\mathtt{G}^m\mathtt{T}^m$. As $t$ is in normal
form, we have $N = 4m$, and as $m > n$, there is a smallest integer $k \ge 1$ such that state
$s_k$ is equal to a previous one, that is, $s_k = s_j$ such that $j < k$. By the choice of $k$, we
have $k \le n < m$. Let $x = x_1 \cdots x_j$, $u = x_{j+1} \cdots x_k$, $x' = x_{k+1} \cdots x_N$, and $y = y_1 \cdots y_j$,
$v = y_{j+1} \cdots y_k$, $y' = y_{k+1} \cdots y_N$. As $k - j > 0$ and $t$ is in normal form we have that

$$|u| > 0 \text{ or } |v| > 0. \quad (9)$$

Using a standard pumping argument for finite state machines, the path that results if we
delete from $\pi$ the $k - j$ edges between $s_j$ and $s_k$ is also an accepting path whose label is
$(xx', yy')$. As each $x_i$ and $y_i$ is of length 0 or 1, we have $|xu| \le k < m$ and $|yv| < m$,
and also $|u| \le k - j$ and $|v| \le k - j$. This implies $xx' = \mathtt{A}^{m-|u|}\mathtt{C}^m$ and $yy' = \mathtt{G}^{m-|v|}\mathtt{T}^m$.

As $xx' \ne yy'$ and $yy' \in t(xx')$ we have that $\{xx', yy'\}$ is not $\delta$-nonoverlapping, which
implies $xx' = \delta(yy')$, that is, $\mathtt{A}^{m-|u|}\mathtt{C}^m = \mathtt{A}^m\mathtt{C}^{m-|v|}$ and, therefore, $|u| = |v| = 0$, which
contradicts (9). □

4 Expressiveness of Transducer-based Properties 

In this section we examine the descriptive power of the newly defined transducer
DNA-related properties, that is, the $\theta$-transducer properties. In Theorem 12 we show that these
properties properly include the regular $\theta$-trajectory properties. On the other hand, in
Proposition 10 we show that there is an independent DNA-related property that is not a
$\theta$-transducer property.

Proposition 10. The $\theta$-free property (defined below) [13] is not a $\theta$-transducer property.
(D) A language $L \subseteq A^*$ is $\theta$-free if and only if $L^2 \cap A^+\theta(L)A^+ = \emptyset$.

Proof. First note that every $\theta$-transducer property is 3-independent, so it is sufficient to
show that, for $\theta = \delta$ and $A = \Delta$, the $\theta$-free property is not 3-independent. Assume the
contrary and consider the language

$$K = \{\mathtt{ACGT}, \mathtt{CCAC}, \mathtt{GTAA}\}.$$

This is not $\delta$-free, as $\mathtt{ACGT} = \delta(\mathtt{ACGT})$ and $\mathtt{CCACGTAA} \in \Delta^+\mathtt{ACGT}\Delta^+$. On the other hand,
one verifies that every nonempty subset of $K$ of cardinality less than 3 is $\delta$-free, so by our
assumption also $K$ must be $\delta$-free, which is a contradiction. □
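
The verification left to the reader in the proof of Proposition 10 can be done mechanically. The Python sketch below checks condition (D) by brute force for the language K = {ACGT, CCAC, GTAA} and for each of its nonempty subsets with fewer than three words; the function names are illustrative assumptions.

```python
from itertools import combinations

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def delta(word):
    """DNA involution: reverse complement of a word."""
    return "".join(COMPLEMENT[c] for c in reversed(word))

def is_free(language, theta=delta):
    """(D): no product uv of two words of L contains some theta(w), w in L,
    with a nonempty part on both sides."""
    images = {theta(w) for w in language}
    for u in language:
        for v in language:
            uv = u + v
            # Dropping the first and last letter keeps both sides nonempty.
            if len(uv) >= 2 and any(img in uv[1:-1] for img in images):
                return False
    return True

K = {"ACGT", "CCAC", "GTAA"}
small_subsets = [set(c) for r in (1, 2) for c in combinations(sorted(K), r)]
print(is_free(K))
print(all(is_free(s) for s in small_subsets))
```

The violation in K needs all three words at once (CCAC followed by GTAA contains the image of ACGT), which is why no subset of size one or two exhibits it.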

The remainder of this section is devoted to Theorem 12. Recall the DNA alphabet is
$\Delta = \{\mathtt{A}, \mathtt{C}, \mathtt{G}, \mathtt{T}\}$. The following DNA language property is considered in Theorem 12:

$$\mathcal{H} = \{L \subseteq \Delta^* \mid H(u, \theta(v)) \ge 2, \text{ for all } u, v \in L\},$$

where $H$ is the Hamming distance function with the assumption that its value is $\infty$
when applied on different length words. Note that $\mathcal{H}$ is described by $\delta$ and the transducer
shown in Fig. 3.

Figure 3: The transducer describing, together with $\delta$, the S-property $\mathcal{H}$.

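Example 11. The following DNA languages do not satisfy $\mathcal{H}$:

$$L_0 = \{\mathtt{AGG}, \mathtt{CCA}\}, \quad L'_0 = \{\mathtt{GAG}, \mathtt{CCC}\}.$$

For instance, $H(\mathtt{CCA}, \delta(\mathtt{AGG})) = 1$. The following languages satisfy $\mathcal{H}$:

$$L_1 = \{\mathtt{ACG}, \mathtt{GAT}\}, \quad L_2 = \{\mathtt{CAC}, \mathtt{GCT}\},$$

$$L_3 = \{\mathtt{AAA}, \mathtt{CCT}\}, \quad L_4 = \{\mathtt{AAA}, \mathtt{CTC}\}, \quad L_5 = \{\mathtt{AAA}, \mathtt{TCC}\}.$$

For instance, as $\delta(\mathtt{AAA}) = \mathtt{TTT}$ and all words $u \in L_3$ contain at most one $\mathtt{T}$, it follows that
$H(u, \delta(\mathtt{AAA})) \ge 2$. Now using $\delta(\mathtt{CCT}) = \mathtt{AGG}$, one verifies that $H(u, \delta(\mathtt{CCT})) \ge 2$ for any
$u \in L_3$. Thus, indeed $L_3$ satisfies $\mathcal{H}$.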
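
The claims of Example 11 can be checked with a few lines of Python. The sketch reads the bound in the Hamming-distance property as "at least 2", which is the reading under which the seven example languages behave as stated; the function names are illustrative assumptions.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def delta(word):
    """DNA involution: reverse complement of a word."""
    return "".join(COMPLEMENT[c] for c in reversed(word))

def hamming(u, v):
    """Hamming distance, taken to be infinite for words of different length."""
    if len(u) != len(v):
        return float("inf")
    return sum(a != b for a, b in zip(u, v))

def satisfies_H(language):
    """Every word is at distance >= 2 from the image of every word (itself included)."""
    return all(hamming(u, delta(v)) >= 2 for u in language for v in language)

failing = [{"AGG", "CCA"}, {"GAG", "CCC"}]
passing = [{"ACG", "GAT"}, {"CAC", "GCT"},
           {"AAA", "CCT"}, {"AAA", "CTC"}, {"AAA", "TCC"}]
print([satisfies_H(L) for L in failing])
print([satisfies_H(L) for L in passing])
```

Note that the quantification includes the case u = v, so a word that is close to its own image already violates the property.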

Theorem 12.

1. Let $\theta$ be an antimorphic involution. Every regular $\theta$-trajectory property is a
$\theta$-transducer property.

2. Property $\mathcal{H}$ is a $\delta$-transducer property, but not a (regular) $\delta$-trajectory one.

Proof. We use the following notation: $\Phi^{?}_e$ for either of the operators $\Phi_e$ and $\Phi^s_e$, and $\mathcal{B}^{?}(e)$
for either of the properties $\mathcal{B}(e)$ and $\mathcal{B}^s(e)$.

For the first statement, we show that given any trajectory regular expression $a$, each of
the following operators is a transducer operator

$$t^1_a(X) = X \leadsto_a A^*$$

$$t^2_a(X) = X \sqcup\!\sqcup_a A^*$$

$$t^3_a(X) = X \leadsto_a A^+$$

$$t^4_a(X) = X \sqcup\!\sqcup_a A^+$$

The statement then would follow by noting that if $t$ and $s$ are transducer operators then
also $(t \circ s)$ and $(t \vee s)$ are transducer operators [3], and if $a$ is an automaton, then one can
construct the transducer $(s \uparrow a)$ such that $y \in (s \uparrow a)(x)$ if and only if $y \in s(x) \cap L(a)$
[23]. For example, for any pair $e = (e_1, e_2)$, we have that

$$\Phi^s_e(L) = (t^2_{e_2} \circ (t^1_{e_1} \uparrow a^+))(L),$$

where $a^+$ is any automaton accepting $A^+$.

The claim about $t^4_a$ is already shown in [8]. For the claim about $t^2_a$, first note that
$X \sqcup\!\sqcup_a A^* = (X \sqcup\!\sqcup_a A^+) \cup (X \sqcup\!\sqcup_a \{\varepsilon\})$, so $t^2_a$ is equal to $(t^4_a \vee t_{a,\mathrm{id}})$, where $t_{a,\mathrm{id}}$ is a transducer
with $t_{a,\mathrm{id}}(x) = x \sqcup\!\sqcup_a \{\varepsilon\}$ and defined as follows. First note that by definition, $y \in x \sqcup\!\sqcup_a \{\varepsilon\}$
if and only if $y = x$ and $0^{|x|} \in L(a)$. Let $\hat{a}$ be an automaton with no empty transitions
accepting $L(a)$. Then, $t_{a,\mathrm{id}}$ is made based on $\hat{a}$ as follows. Its set of transitions consists
of all tuples $(p, \sigma/\sigma, q)$, for $\sigma \in A$, such that $(p, 0, q)$ is a transition of $\hat{a}$; we say that the latter is the
corresponding transition of the former. The initial and final states of $t_{a,\mathrm{id}}$ are those initial
and final states, respectively, of $\hat{a}$ that appear in the transitions of $t_{a,\mathrm{id}}$. It follows that
$t_{a,\mathrm{id}}$ realizes a pair $(x, y)$ of words using some path $P$ of transitions, if and only if $x = y$
and the automaton $\hat{a}$ accepts $0^{|x|}$ using a path consisting of the corresponding transitions
that make the path $P$.

In [22] it is observed that $y \in (x \leadsto_t w)$ if and only if $x \in (y \sqcup\!\sqcup_t w)$, for all words $x, y, w$
and trajectories $t$, which implies that $t^1_a$ and $t^3_a$ are simply the inverses of the transducers
$t^2_a$ and $t^4_a$, respectively.

For the second statement we recall that $\mathcal{H}$ is described by $\delta$ and the transducer shown
in Fig. 3. For the second part of the statement, we argue by contradiction, so we assume
that there is a pair of trajectory regular expressions $e = (e_1, e_2)$ such that

$$\mathcal{H} = \mathcal{B}^{?}(e_1, e_2).$$

Using the definition of $\Phi^{?}_e$, one verifies that

$$\Phi^{?}_e(a) \subseteq A^* a A^*, \text{ for all } a \in A.$$

Consider the DNA language $K = \{\mathtt{A}, \mathtt{C}\}$. One verifies that $K$ does not satisfy $\mathcal{H}$, but on
the other hand $\delta(K) \cap \Phi^{?}_e(K) = \emptyset$, which means that $K$ satisfies $\mathcal{B}^{?}(e_1, e_2)$, which leads to
the required contradiction. □

The counterexample used to prove the second statement of Theorem 12 is a little
artificial, as the language $K = \{\mathtt{A}, \mathtt{C}\}$ consists of 1-letter words, which is of no practical
value. The next result gives a stronger statement, as it requires that all words involved are
of length at least 2.

Proposition 13. The following property

$$\mathcal{H}_2 = \{L \subseteq \Delta^* \mid |u| \ge 2 \text{ and } H(u, \theta(v)) \ge 2, \text{ for all } u, v \in L\}$$

is a $\delta$-transducer property but not a $\delta$-trajectory property.

The proof of this result requires a couple of intermediate results, which we present next.

Lemma 14. Let $x, y$ be any words and $s, t$ be any trajectories. If
$y \in ((x \leadsto_s A^*) \cap A^+) \sqcup\!\sqcup_t A^*$ then

$|t| - |s| = |t|_1 - |s|_1 = |y| - |x|$ and $|s|_1 < |x|$.

Proof. The premise of the statement implies that $y \in z \sqcup\!\sqcup_t w_2$ and
$z \in ((x \leadsto_s w_1) \cap A^+)$ for some words $z, w_1, w_2$ with $|z| > 0$. Informally, this
means that $y$ results by deleting $|w_1|$ symbols from $x$, with $|w_1| < |x|$, and then inserting
$|w_2|$ symbols. More formally, as $|t| = |y|$ and $|s| = |x|$, we have that $|t| - |s| = |y| - |x|$.
Also, as $|z| = |x| - |w_1| = |s| - |s|_1$, we have that $|s| > |s|_1$ and, therefore, $|x| > |s|_1$,
as required. Now, we have

$|s|_1 = |w_1| = |x| - |z| = |x| - (|y| - |w_2|) = |x| - |y| + |t|_1$

and, therefore, $|t|_1 - |s|_1 = |y| - |x|$. □
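
The identities of Lemma 14 can be checked mechanically on small words. The following is a minimal Python sketch, not taken from the paper or from FAdo: the helper names `delete_on`, `insert_on`, `phi` and `check_lemma14` are hypothetical, and the sketch assumes the usual conventions that a deletion trajectory $s$ has length $|x|$ with its 1-positions marking deleted symbols, and that an insertion trajectory $t$ takes the next symbol of the first word on 0 and of the inserted word on 1.

```python
from itertools import product

DNA = "ACGT"

def delete_on(x, s):
    """Deletion x ~>_s w: return (z, w), where w is spelled by the 1-positions of s."""
    if len(s) != len(x):
        return None
    z = "".join(c for c, b in zip(x, s) if b == "0")
    w = "".join(c for c, b in zip(x, s) if b == "1")
    return z, w

def insert_on(z, t, w):
    """Shuffle of z and w along t: take from z on 0 and from w on 1."""
    if t.count("0") != len(z) or t.count("1") != len(w):
        return None
    zi, wi = iter(z), iter(w)
    return "".join(next(zi) if b == "0" else next(wi) for b in t)

def phi(x, s, t, alphabet=DNA):
    """All y in ((x ~>_s A*) ∩ A+) shuffled along t with A*."""
    d = delete_on(x, s)
    if d is None or d[0] == "" or t.count("0") != len(d[0]):
        return set()
    k = t.count("1")
    return {insert_on(d[0], t, "".join(w)) for w in product(alphabet, repeat=k)}

def check_lemma14(max_len=3):
    """Exhaustively test the identities of Lemma 14 for short words; return #checks."""
    checked = 0
    for n in range(1, max_len + 1):
        for x in map("".join, product(DNA, repeat=n)):
            for s in map("".join, product("01", repeat=n)):
                for m in range(1, max_len + 1):
                    for t in map("".join, product("01", repeat=m)):
                        for y in phi(x, s, t):
                            ones = t.count("1") - s.count("1")
                            assert len(t) - len(s) == ones == len(y) - len(x)
                            assert s.count("1") < len(x)
                            checked += 1
    return checked
```

Calling `check_lemma14()` raises an `AssertionError` if any identity fails and otherwise returns the number of triples it inspected; it is only an illustration of the statement on bounded inputs, not a proof.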

Lemma 15. Let e = (ei,e 2 ) be a pair of trajectory regular expressions and assume that 
n = b- (e)— as we shall see further below this assumption leads to a contradiction. 

1. There is no pair $(s, t)$ of trajectories in $L(e_1) \times L(e_2)$ such that $|s| = |t| = 3$ and
$|s|_1 = |t|_1 = 2$.

2. If $x, y$ are DNA words of length 3 and $(s, t) \in L(e_1) \times L(e_2)$ such that $x \ne \delta(y)$ and
$y \in ((x \leadsto_s A^*) \cap A^+) \sqcup\!\sqcup_t A^*$ then $|s| = |t| = 3$ and $|s|_1 = |t|_1 = 1$.

3. We have that $010 \in L(e_1)$ or $010 \in L(e_2)$.

4. We have that $(001, 001) \in L(e_1) \times L(e_2)$ or $(100, 100) \in L(e_1) \times L(e_2)$.

Proof. We shall use some of the seven languages in Example 11. 

For the first statement, assume for the sake of contradiction that the two trajectories 
have equal length and exactly two Is each. By applying (AAA A*) n A + followed by 
LUfA*, the result is $ ? (AAA) and is equal to AAA or AAA or AAA, depending on whether 

t = Oil or t — 101 or t — 110, respectively. More specifically, if t — Oil then 4> ? (AAA) 

contains 5(CCT), which contradicts the fact that L 3 satisfies PL. If t — 101 then <1> ? (AAA) 

contains 5(CTC), which contradicts the fact that L 4 satisfies PL. If t — 110 then <1> ? (AAA) 

contains 5(TCC), which contradicts the fact that L 5 satisfies PL. 
For the second statement, Lemma 14 implies that $|s| = |t| = 3$ and $|s|_1 = |t|_1 \le 1$, and
$x \ne \delta(y)$ implies that $|s|_1 \ne 0$. Hence, $|s|_1 = |t|_1 = 1$, as required.

For the third statement, the fact that L ' 0 does not satisfy PL implies that there are 
words u,v G L ' 0 such that S(v ) G <l> ? (w) and, therefore, there are words wi,w 2 and ( s,t ) G 
L(ei) x L{e 2 ) such that 

S(v) G ((w ~^ s U>i) fl A + ) LU t w 2 . 

By the previous statement, |s| = \t\ =3 and |s|i = |t|i = 1, which implies |wi| = \w 2 \ = 1. 
For the sake of contradiction assume s ^ 010 and t ^ 010. Let u = u\u 2 u 3 with each 
Ui being a symbol. There are four cases about the values of s and t, all of which lead to 
contradictions. For example, if s — 001 and t = 001 then S(v) = U\U 2 w 2) which implies 
that v = w 2 u 2 Ui. By inspection, one verifies that uiu 2 u 3 ,w 2 u 2 ui cannot be both in L' 0 . 

For the fourth statement, the fact that L 0 does not satisfy T~L implies that there are 
words u,v G L 0 such that 5(v) G <L ? (m) and, therefore, there are words w±,w 2 and (s,£) G 
L(ei) x L(e 2 ) such that 

S(v) G ((u wi) n A + ) LUt w 2 . 

By a previous statement, |s| = \t\ — 3 and |s|i = |t|i = 1, which implies |wi| = \w 2 \ = 1. 
Let u = u\u 2 u 3 with each tq being a symbol. The rest of the proof consists of four parts: 

s = 010 leads to a contradiction; 
t = 010 leads to a contradiction; 
s = 001 implies t = 001 ; 
s = 100 implies t = 100 . 

We demonstrate the first and fourth parts and leave the other two parts to the reader to 
verify. For the first part, if s = 010 then depending on whether t = 001 or t — 010 or 
t = 100, we have that 5{v) = UiU 3 w 2 or S(v) = w 2 uiu 3 or S(v) = w 2 uiu 3 , and hence, 
v = w 2 u 3 ui or v = w 2 u 3 u\ or v = u 3 u\w 2 . One verifies by inspection that, in any case, it 
is impossible to have u,v G Lq. Finally for the last part, if s = 100 then, as t cannot be 
010, we have that S(v) = u 2 u 3 w 2 or 5(v) = w 2 u 2 u 3 and hence, v = w 2 u 3 u 2 or v = u 3 u 2 w 2 . 
One verifies by inspection that, in either case, it is impossible to have u, v G L 0 . □ 

Figure 4: The transducer describing, together with $\delta$, the $\mathcal{S}$-property $\mathcal{H}_2$.

Proof. (Of Proposition 13.) The fact that $\mathcal{H}_2$ is a $\delta$-transducer $\mathcal{S}$-property is established
using the transducer in Fig. 4. For the second part of the statement, we argue by
contradiction, so we assume that there is a pair of trajectory regular expressions $(e_1, e_2)$ such
that

7-L2 = B\e i,e 2 ). 

By Lemma 15, we have that $001 \in L(e_2)$ or $100 \in L(e_2)$, and that $001 \in L(e_1)$ or
$100 \in L(e_1)$. Moreover, we can distinguish the following four cases, which all lead to
contradictions. We also consider the languages $L_1$ and $L_2$ defined in Example 11.

Case '$010 \in L(e_1)$ and $001 \in L(e_2)$': Then, GCT results into GT, then into GTG and then
into CAC using, respectively, the operations $\leadsto_{010}$, $\sqcup\!\sqcup_{001}$ and $\delta$, which contradicts the fact
that $L_2$ satisfies $\mathcal{H}$.

Case '$010 \in L(e_1)$ and $100 \in L(e_2)$': Then, GAT results into GT, then into CGT and then
into ACG using, respectively, the operations $\leadsto_{010}$, $\sqcup\!\sqcup_{100}$ and $\delta$, which contradicts the fact
that $L_1$ satisfies $\mathcal{H}$.

Case '$001 \in L(e_1)$ and $010 \in L(e_2)$': Then, ACG results into AC, then into ATC and then
into GAT using, respectively, the operations $\leadsto_{001}$, $\sqcup\!\sqcup_{010}$ and $\delta$, which contradicts the fact
that $L_1$ satisfies $\mathcal{H}$.

Case '$100 \in L(e_1)$ and $010 \in L(e_2)$': Then, CAC results into AC, then into AGC and then
into GCT using, respectively, the operations $\leadsto_{100}$, $\sqcup\!\sqcup_{010}$ and $\delta$, which contradicts the fact
that $L_2$ satisfies $\mathcal{H}$. □
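
The four derivations in the case analysis can be replayed step by step. The sketch below is illustrative only: the function names `delta`, `delete_on`, `insert_on`, `hamming` and `replay` are hypothetical, the inserted letter in each case is read off from the words quoted in the proof, and the DNA involution is taken to be the reverse Watson-Crick complement. The membership of the words in $L_1$ and $L_2$ comes from Example 11 and is not checked here.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def delta(w):
    """DNA involution: complement each letter and reverse the word."""
    return "".join(COMPLEMENT[c] for c in reversed(w))

def delete_on(x, s):
    """Delete from x the symbols at the 1-positions of the trajectory s."""
    assert len(x) == len(s)
    return "".join(c for c, b in zip(x, s) if b == "0")

def insert_on(z, t, w):
    """Interleave z (on 0) and w (on 1) along the trajectory t."""
    assert t.count("0") == len(z) and t.count("1") == len(w)
    zi, wi = iter(z), iter(w)
    return "".join(next(zi) if b == "0" else next(wi) for b in t)

def hamming(u, v):
    """Hamming distance between two words of equal length."""
    assert len(u) == len(v)
    return sum(a != b for a, b in zip(u, v))

def replay(u, s, t, inserted):
    """Return the three intermediate words: after deletion, insertion, and delta."""
    z = delete_on(u, s)
    y = insert_on(z, t, inserted)
    return z, y, delta(y)

# (u, deletion trajectory, insertion trajectory, inserted letter) for the four cases
CASES = [
    ("GCT", "010", "001", "G"),
    ("GAT", "010", "100", "C"),
    ("ACG", "001", "010", "T"),
    ("CAC", "100", "010", "G"),
]

RESULTS = [replay(*case) for case in CASES]
```

In each case the middle word $y$ differs from the start word $u$ in exactly two positions, so the pair $u, \delta(y)$ does not violate the Hamming condition $H(u, \theta(v)) \ge 2$ even though it is related by the assumed pair of trajectories.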

5 The Satisfaction and Maximality Problems 

For $\theta = \mathrm{id}$ and for input-altering and -preserving transducers the satisfaction and
maximality problems are decidable [8]. In particular, for a regular language $L$ given via an
automaton $a$, Condition (1) can be decided in time $O(|t| |a|^2)$, where the function $|\cdot|$ returns
the size of the machine in question (its number of edges plus the length of all labels on the
edges). Condition (2) can be decided in time $O(|t| |a|^2)$, as noted in Remark 16. The
maximality problem is decidable, but PSPACE-hard, for both input-altering and -preserving
transducer properties.

Remark 16. Let $s = t \downarrow a \uparrow a$ be the transducer obtained by two product constructions:
first on the input of $t$ with $a$; then, on the output of the resulting transducer with $a$. In [8]
the authors suggest to decide whether or not $L$ satisfies the input-preserving transducer
property $\mathcal{W}_{\mathrm{id},t}$ by testing if the transducer $s$ is functional ($|s(x)| \le 1$ for all $x \in A^*$).
However, deciding $L \in \mathcal{W}_{\mathrm{id},t}$ can be done by the cheaper test of whether or not $s$ implements
a (partial) identity function ($s(x) = \{x\}$ or $s(x) = \emptyset$ for all $x \in A^*$). Using the identity
test from [1], we obtain that Condition (2) can be decided in time $O(|t| |a|^2)$ when the
alphabet is considered constant. Also note that the identity test does not require that $t$
is input-preserving if $\theta = \mathrm{id}$. When $\theta$ is antimorphic, however, the identity test does not
work anymore and we have to resort to the more expensive functionality test for
$\theta$-input-preserving transducers.

In this work we are interested in the case when $\theta \ne \mathrm{id}$ is antimorphic; furthermore, the
$\theta$-input-altering or -preserving restrictions on the transducer are not necessarily present
in the definition of $\mathcal{W}$-properties or $\mathcal{S}$-properties. Table 1 summarizes under which
conditions the satisfaction and maximality problems are decidable for regular languages. For
the satisfaction problem, except for the case of non-restricted transducer $\mathcal{W}$-properties,
Conditions (7) and (8) can be tested similarly to Conditions (1) and (2). For the case of
non-restricted transducer $\mathcal{W}$-properties, we show decidability using a different method; see
Sect. 5.1. The undecidability result holds for every fixed antimorphic permutation $\theta$ over an
alphabet with at least two letters; in particular, all results apply to the DNA involution $\delta$.
All maximality results are discussed in Sect. 5.2.


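Problem      | Property S_{θ,t}:            | Property S_{θ,t}:            | Property W_{θ,t}:            | Property W_{θ,t}:
             | no restriction               | t is θ-i.-altering           | no restriction               | t is θ-i.-preserving
-------------+------------------------------+------------------------------+------------------------------+------------------------------
Satisfaction | decidable in O(|t||a|^2),    | decidable in O(|t||a|^2),    | decidable,                   | decidable in O(|t|^2|a|^4),
             | as in [8]                    | as in [8]                    | Theorem 21                   | as in [8]
Maximality   | undecidable,                 | decidable, PSPACE-hard,      | decidable, PSPACE-hard,      | decidable, PSPACE-hard,
             | Corollary 26                 | Theorem 22, Corollary 23     | Theorem 22, Corollary 23     | Theorem 22, Corollary 23

Table 1: (Un-)decidability of the satisfaction and the maximality problems for a fixed
antimorphic permutation $\theta$, a given transducer $t$, and a regular language $L$ given via an
automaton $a$.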

Remark 17. We note that deciding the satisfaction question for any $\theta$-trajectory property
involves testing the emptiness conditions in (5) or (6), which requires time $O(|a|^2 |a_1| |a_2|)$,
where $a_1, a_2$ are automata corresponding to $e_1, e_2$. Such a property can be expressed as a
$\theta$-transducer $\mathcal{S}$-property (recall Theorem 12) using a transducer of size $O(|a_1| |a_2|)$ and,
therefore, the satisfaction question can still be solved within the same asymptotic time
complexity.

5.1 The Satisfaction Problem for non-restricted W-properties 

We establish the decidability of non-restricted transducer $\mathcal{W}$-properties for regular
languages. We do not address the complexity of this algorithm; optimizing the algorithm and
analyzing its complexity is part of future research. Let $t$ be a transducer, $\theta$ be an
antimorphic permutation, and $L$ be a regular language over the alphabet $A$. Let $a_L$ and $a_{\theta(L)}$ be
the NFAs accepting the languages $L$ and $\theta(L)$, respectively. Let
$s = (Q_s, A, E_s, I_s, F_s) = t \downarrow a_L \uparrow a_{\theta(L)}$ be the product transducer such that
$y \in s(x)$ if and only if $y \in t(x)$, $x \in L$, and $y \in \theta(L)$. We consider $s$ to be trim, i.e.,
every state in $Q_s$ lies on a path that leads from an initial state to a final state.
Furthermore, $s$ is considered to be in normal form such that every edge is either labeled
$(a, \varepsilon)$ or $(\varepsilon, a)$ for some letter $a \in A$. Thus, for any path
$p \xrightarrow{(x,y)}{}^{*} q$ of length $\ell$ (the path has $\ell$ edges) in $s$ we have $|xy| = \ell$.

Lemma 18. Let $L$ be a regular language, $t$ be a transducer, $\theta$ be an antimorphic involution,
and $s = t \downarrow a_L \uparrow a_{\theta(L)}$ (all defined over $A$). The regular language $L$ satisfies
$\mathcal{W}_{\theta,t}$ if and only if for all words $x, y \in A^+$

$y \in s(x) \implies \theta(x) = y.$

Proof. We will prove the contrapositive: $L \notin \mathcal{W}_{\theta,t}$ if and only if there exist
$x, y \in A^+$ such that $y \in s(x)$ and $\theta(x) \ne y$. Recall that $L \notin \mathcal{W}_{\theta,t}$ if and only if
there exists $w \in L$ such that $\theta(w) \in t(L \setminus w)$.

Assume that $L \notin \mathcal{W}_{\theta,t}$ and, therefore, $w \in L$ exists such that $\theta(w) \in t(L \setminus w)$. Let
$x \in L \setminus w$ such that $\theta(w) \in t(x)$ and $y = \theta(w) \in \theta(L)$. Clearly, we have $y \in s(x)$ and
$y \ne \theta(x)$.

Conversely, assume that $x, y \in A^+$ exist such that $y \in s(x)$ and $y \ne \theta(x)$. Let
$w = \theta^{-1}(y)$ and note that $w \in L$ (because $y \in \theta(L)$), $x \in L \setminus w$, and
$\theta(w) \in t(x) \subseteq t(L \setminus w)$. Therefore, $L \notin \mathcal{W}_{\theta,t}$. □

Let $T_s = \{ (x_1, x_2, x_3) \in (A^*)^3 \mid |x_1 x_2 x_3| < |s| \}$ be a set of word triples. Note that
the length restriction for the words ensures that $T_s$ is a finite set. For each triple
$t = (x_1, x_2, x_3) \in T_s$ we define a relation

$R_t = \{ (x_1 (x_2)^k x_3, \theta(x_1 (x_2)^k x_3)) \mid k \in \mathbb{N} \} \subseteq A^* \times A^*.$

Note that we allow that any word of $x_1, x_2, x_3$ is empty; in particular, if $x_2 = x_3 = \varepsilon$, then
$R_t$ contains only one pair of words $(x_1, \theta(x_1))$.

Lemma 19. Let $L$ be a regular language, $t$ be a transducer, $\theta$ be an antimorphic involution,
and $s = t \downarrow a_L \uparrow a_{\theta(L)}$ (all defined over $A$). The regular language $L$ satisfies
$\mathcal{W}_{\theta,t}$ if and only if the relation realized by $s$ satisfies

$s \subseteq \bigcup_{t \in T_s} R_t. \qquad (10)$

Proof. Recall that for every $(x, y) \in R_t$ with $t \in T_s$ we have $\theta(x) = y$. If $s$ satisfies
Equation (10), then for all $(x, y)$ which are realized by $s$, we have $\theta(x) = y$; and by
Lemma 18 $L$ satisfies $\mathcal{W}_{\theta,t}$.

Conversely, suppose that $L$ satisfies $\mathcal{W}_{\theta,t}$, let $(x, y)$ be a pair of words that is realized
by $s$, and note that $\theta(x) = y$ by Lemma 18. If $|x| < |s|$, then $(x, \theta(x)) = (x, y) \in R_t$ for
$t = (x, \varepsilon, \varepsilon) \in T_s$.

Otherwise, every accepting path in $s$ that is labeled by $(x, \theta(x))$ contains more than $|s|$
edges, and therefore, must have a repeating state $p$

$s \xrightarrow{(x_1,y_1)}{}^{*} p \xrightarrow{(x_2,y_2)}{}^{*} p \xrightarrow{(x_3,y_3)}{}^{*} f$

such that $x = x_1 x_2 x_3$, $\theta(x) = y_1 y_2 y_3$, $s \in I_s$, $f \in F_s$, $x_2 y_2 \ne \varepsilon$, and
$|x_1 x_2 y_1 y_2| < |s|$ (using the pigeonhole principle). By Lemma 18 for all $i \in \mathbb{N}$

$x_1 x_2^i x_3 = \theta^{-1}(y_1 y_2^i y_3) = \theta^{-1}(y_3) \theta^{-1}(y_2)^i \theta^{-1}(y_1).$

Firstly note that this implies $|x_2| = |y_2|$. Now, consider $i = 2|x|$. Because
$|x_1 x_2 x_3| > |s| > |x_1 x_2 y_1 y_2|$, we have that $\theta^{-1}(y_2) \theta^{-1}(y_1)$ is a suffix of $x_3$. Since
$i$ is sufficiently large, the suffix $x_2 x_3$ of $x_1 x_2^i x_3$ cannot overlap with the prefix
$\theta^{-1}(y_3)$ of $x_1 x_2^i x_3$. Hence, there exists a suffix $u$ of $\theta^{-1}(y_2)$ and an integer
$j \ge 2$ such that

$x_2 x_3 = u \, \theta^{-1}(y_2)^j \, \theta^{-1}(y_1).$

Choose $v$ such that $\theta^{-1}(y_2) = vu$ and note that $x_2 = uv$ because $|x_2| = |y_2|$ (this argument
is a special case of the well-known Fine and Wilf's Theorem). Let $x_3' = u \theta^{-1}(y_1)$ and
observe that $x_3 = (uv)^{j-1} u \theta^{-1}(y_1) = x_2^{j-1} x_3'$. Furthermore,
$|x_1 x_2 x_3'| \le |x_1 x_2 y_1 y_2| < |s|$. We conclude that
$(x, \theta(x)) = (x_1 x_2^j x_3', \theta(x_1 x_2^j x_3')) \in R_t$ for $t = (x_1, x_2, x_3') \in T_s$. □

In order to test whether or not Equation (10) is satisfied, we perform two separate
tests. Firstly, we test whether or not $s$ satisfies the weaker condition

$s \subseteq \bigcup_{(x_1, x_2, x_3) \in T_s} (x_1 x_2^* x_3) \times \theta(x_1 x_2^* x_3). \qquad (11)$

Secondly, we ensure that

$\forall x, y \colon\ y \in s(x) \implies |x| = |y|. \qquad (12)$

Lemma 20. Equation (10) is satisfied if and only if Equations (11) and (12) are satisfied.

Proof. If Equation (10) is satisfied, then Equation (11) is satisfied because
$R_{(x_1,x_2,x_3)} \subseteq (x_1 x_2^* x_3) \times \theta(x_1 x_2^* x_3)$ for $(x_1, x_2, x_3) \in T_s$. Also note that for all
$(x, y) \in R_t$ with $t \in T_s$ we have $|x| = |y|$; therefore, Equation (10) implies Equation (12).

Conversely, assume that Equations (11) and (12) are satisfied. For all $(x, y)$ that are
realized by $s$ there exist $(x_1, x_2, x_3) \in T_s$ and $i, j \in \mathbb{N}$ such that $x = x_1 x_2^i x_3$
and $y = \theta(x_1 x_2^j x_3)$. Since the equation $|x| = |y|$ must also be satisfied, it is clear that
$i = j$ and, hence, $(x, y) \in R_{(x_1,x_2,x_3)}$. We conclude that Equations (11) and (12) imply
Equation (10). □

Theorem 21. Let $L$ be a regular language given as automaton, $t$ be a given transducer,
and $\theta$ be a given antimorphic involution (all defined over $A$). It is decidable whether $L$
satisfies $\mathcal{W}_{\theta,t}$ or not.

Proof. According to Lemmas 19 and 20 we have to decide whether or not the two
Equations (11) and (12) are satisfied for the transducer $s = t \downarrow a_L \uparrow a_{\theta(L)}$. It is known that
it is decidable whether or not a given transducer is included in a recognizable relation (that
is, a relation $\bigcup_{i=1}^{n} A_i \times B_i$ for regular $A_i, B_i$); see [3]. Therefore, the inclusion in
Equation (11) is decidable.

The property in Equation (12) can be verified by an algorithm that assigns an integer
to each state in $s$: the integer $i$ is assigned to $q \in Q_s$ if there exists a path
$s \xrightarrow{(x,y)}{}^{*} q$ from a starting state $s \in I_s$ such that $i = |x| - |y|$. The test fails if a state
is assigned two distinct integers or if a final state from $F_s$ is assigned an integer different
from 0; otherwise, the test is successful. Assigning the integers can be done by a simple
depth-first traversal of $s$. We omit further details on the implementation of this algorithm as
it can be done analogously to the test whether or not a given transducer implements a (partial)
identity function, which can be found in [1]. □
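
The integer-labelling test for Equation (12) described in the proof can be sketched as follows. This is an illustrative reading of that description, not the authors' implementation and not FAdo's API: the edge-list representation `(p, x, y, q)` and the function name `lengths_balanced` are assumptions made for the example, and the transducer is assumed to be trim, as in the text.

```python
def lengths_balanced(edges, initial, final):
    """Check |x| == |y| for every accepted pair of a trim transducer.

    edges   : iterable of (p, x, y, q), an edge p -> q labeled (x, y)
    initial : iterable of initial states
    final   : iterable of final states
    """
    # Outgoing edges, annotated with the length difference |x| - |y| they add.
    out = {}
    for p, x, y, q in edges:
        out.setdefault(p, []).append((len(x) - len(y), q))

    diff = {}       # state -> integer |x| - |y| of any path reaching it
    stack = []
    for s in initial:
        diff[s] = 0
        stack.append(s)

    # Depth-first traversal; a conflict means two paths with different balance.
    while stack:
        p = stack.pop()
        for d, q in out.get(p, []):
            value = diff[p] + d
            if q not in diff:
                diff[q] = value
                stack.append(q)
            elif diff[q] != value:
                return False

    # Every reachable final state must carry balance 0.
    return all(diff.get(f, 0) == 0 for f in final)
```

Because the transducer is trim, a state that receives two different integers lies on two accepting paths with different length balance, so at least one accepted pair violates $|x| = |y|$; this is why a single conflict suffices to reject.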

5.2 The Maximality Problem 

Here we show how to decide maximality of a regular language $L$ with respect to a
$\theta$-transducer property; see Theorem 22. This result only holds when we consider
$\mathcal{W}$-properties or when we consider $\mathcal{S}$-properties for $\theta$-input-altering transducers. As in the
case of existing transducer properties, it turns out that the maximality problem is
PSPACE-hard; see Corollary 23. When we consider general $\mathcal{S}$-properties, the maximality problem
becomes undecidable; see Corollary 26.

Theorem 22. For an antimorphic permutation $\theta$, a transducer $t$, and a regular language
$L$, all defined over $A_k^*$, such that either

i.) $L \in \mathcal{W}_{\theta,t}$ or

ii.) $L \in \mathcal{S}_{\theta,t}$ and $t$ is $\theta$-input-altering,

$L$ is maximal with property $\mathcal{W}_{\theta,t}$ (resp., $\mathcal{S}_{\theta,t}$) if and only if

$L \cup \theta^{-1}(t(L)) \cup t^{-1}(\theta(L)) = A_k^*. \qquad (13)$

Proof. i.) Suppose $L \cup \theta^{-1}(t(L)) \cup t^{-1}(\theta(L)) = A_k^*$. For every word $w \in L^c$ we have
$\theta(w) \in t(L)$ or $w \in t^{-1}(\theta(L))$. In the former case, we immediately obtain that $L \cup w$ does
not satisfy $\mathcal{W}_{\theta,t}$. In the latter case, there exists $u \in L$ such that $\theta(u) \in t(w)$, and
therefore, $L \cup w$ does not satisfy $\mathcal{W}_{\theta,t}$. We conclude that $L$ is maximal with respect to
$\mathcal{W}_{\theta,t}$.

Conversely, suppose there exists a word $w$ such that $w \notin L \cup \theta^{-1}(t(L)) \cup t^{-1}(\theta(L))$.
Clearly, $w \in L^c$. Furthermore, we must have $\theta(w) \notin t(L)$ and $\theta(u) \notin t(w)$ for all $u \in L$.
Since $L \in \mathcal{W}_{\theta,t}$, we also have that $\theta(u) \notin t(L \setminus u)$ for all $u \in L$. Thus, we obtain that
$\forall u \in (L \cup w) \colon \theta(u) \notin t((L \cup w) \setminus u)$, and therefore, $L$ is not maximal with respect to
$\mathcal{W}_{\theta,t}$.

ii.) Suppose $L \cup \theta^{-1}(t(L)) \cup t^{-1}(\theta(L)) = A_k^*$. For all $w \in L^c$ we have
$\theta(w) \cap t(L) \ne \emptyset$ or $t(w) \cap \theta(L) \ne \emptyset$. Thus, $L \cup w$ does not satisfy $\mathcal{S}_{\theta,t}$ and $L$ is
maximal with respect to $\mathcal{S}_{\theta,t}$.

Conversely, suppose there exists a word $w$ such that $w \notin L \cup \theta^{-1}(t(L)) \cup t^{-1}(\theta(L))$.
Hence, $\theta(w) \cap t(L) = \emptyset$ and $t(w) \cap \theta(L) = \emptyset$. Furthermore, we have $\theta(L) \cap t(L) = \emptyset$
because $L \in \mathcal{S}_{\theta,t}$ and $\theta(w) \cap t(w) = \emptyset$ because $t$ is $\theta$-input-altering. We conclude that
$L \cup w$ satisfies $\mathcal{S}_{\theta,t}$ and therefore, $L$ is not maximal with respect to $\mathcal{S}_{\theta,t}$. □

We note that it is PSPACE-hard to decide whether or not Equation (13) holds when
$L$ is given as NFA because it is PSPACE-hard to decide universality of a regular language
given as NFA ($L \subseteq A_k^*$ is universal if $L = A_k^*$) [29].

Corollary 23. For an antimorphic permutation $\theta$, a transducer $t$, and a regular language
$L$ given as NFA, all defined over $A_k^*$, such that either

i.) $L \in \mathcal{W}_{\theta,t}$ or

ii.) $L \in \mathcal{S}_{\theta,t}$ and $t$ is $\theta$-input-altering,

it is PSPACE-hard to decide whether or not $L$ is maximal with property $\mathcal{W}_{\theta,t}$ (resp.,
$\mathcal{S}_{\theta,t}$).

Proof. According to Theorem 22 deciding maximality of $L$ with property $\mathcal{W}_{\theta,t}$ (resp.,
$\mathcal{S}_{\theta,t}$) is equivalent to deciding universality of $L \cup \theta^{-1}(t(L)) \cup t^{-1}(\theta(L))$. Let
$t_\emptyset$ be a transducer without final state which does not accept any pair of words. Now, $L$ is
maximal with property $\mathcal{S}_{\theta,t_\emptyset}$ (resp., $\mathcal{W}_{\theta,t_\emptyset}$) if and only if $L$ is universal, a
problem which is known to be PSPACE-hard. □

In the rest of this section we show that it is undecidable whether or not a transducer is
$\theta$-input-preserving. This question relates directly to the maximality problem of the empty
language $\emptyset$ with respect to the property $\mathcal{S}_{\theta,t}$, as stated in Corollary 26. We will reduce
the famous, undecidable Post correspondence problem to the problem of deciding whether or
not a given transducer is $\theta$-input-preserving.

Definition 24. The Post correspondence problem (PCP) is, given words
$\alpha_0, \alpha_1, \ldots, \alpha_{\ell-1} \in \Sigma^+$ and $\beta_0, \beta_1, \ldots, \beta_{\ell-1} \in \Sigma^+$, decide whether or not there
exists a non-empty sequence of integers $i_1, i_2, \ldots, i_n \in A_\ell = \{0, 1, \ldots, \ell - 1\}$ such that

$\alpha_{i_1} \alpha_{i_2} \cdots \alpha_{i_n} = \beta_{i_1} \beta_{i_2} \cdots \beta_{i_n}.$

It is well-known that the PCP is undecidable, even if $\Sigma = A_2$ is the binary alphabet.

Theorem 25. For every fixed antimorphic permutation $\theta$ over $A_k^*$ with $k \ge 2$ it is
undecidable whether or not a given transducer is $\theta$-input-preserving.

Proof. Let $\alpha_0, \alpha_1, \ldots, \alpha_{\ell-1} \in \Sigma^+$ and $\beta_0, \beta_1, \ldots, \beta_{\ell-1} \in \Sigma^+$ be the PCP instance
$\mathcal{A}$. We will define a transducer $t_{\mathcal{A}}$ which accepts all pairs $(w, \theta(w))$ unless $w$ is a
binary encoding of a word $uv$ where $u \in \Sigma^+$ and $v \in A_\ell^+$ such that $v$ describes an integer
sequence $i_1, i_2, \ldots, i_n$ that is a solution of $\mathcal{A}$ and $u$ is the corresponding solution word.
For the ease of notation, we assume that $\Sigma$ and $A_\ell$ are two disjoint alphabets and we let
$\Gamma = \Sigma \cup A_\ell$ be their union. For $m = \lceil \log_2 |\Gamma| \rceil$, we let $h \colon \Gamma \to A_2^m$ be a morphic
block code; i.e., an encoding of $\Gamma$ into binary words of length $m$ such that $h(a) = h(b)$ implies
$a = b$ for all $a, b \in \Gamma$. Our goal is to define $t_{\mathcal{A}}$ such that $\theta(w) \notin t_{\mathcal{A}}(w)$ if and only
if $w = h(uv)$ for $u \in \Sigma^+$, $v \in A_\ell^+$, $n = |v|$, and

$u = \alpha_{v[n]} \alpha_{v[n-1]} \cdots \alpha_{v[1]} = \beta_{v[n]} \beta_{v[n-1]} \cdots \beta_{v[1]}.$

The transducer $t_{\mathcal{A}}$ will consist of 3 effectively constructible components $t_R$, $t_\alpha$, and
$t_\beta$. Each component can be seen as a fully functional transducer such that $t_{\mathcal{A}}$ becomes the
union of the three transducers; this implies that

$y \in t_{\mathcal{A}}(x) \iff y \in t_R(x) \cup t_\alpha(x) \cup t_\beta(x).$

Each transducer component "validates" a certain property of a word $w$, by accepting all
word pairs $(w, \theta(w))$ which do not have that property:

1.) $t_R$ accepts $(w, \theta(w))$ if and only if $w \notin h(\Sigma^+ A_\ell^+)$;

2.) for $w = h(uv)$ with $u \in \Sigma^+$ and $v \in A_\ell^+$, $t_\alpha$ accepts $(w, \theta(w))$ if and only if
$u \ne \alpha_{v[n]} \alpha_{v[n-1]} \cdots \alpha_{v[1]}$; and

3.) for $w = h(uv)$ with $u \in \Sigma^+$ and $v \in A_\ell^+$, $t_\beta$ accepts $(w, \theta(w))$ if and only if
$u \ne \beta_{v[n]} \beta_{v[n-1]} \cdots \beta_{v[1]}$.

The first component ensures that every pair $(w, \theta(w))$ that is not accepted by $t_{\mathcal{A}}$ must
have the desired form $w = h(uv)$ with $u \in \Sigma^+$ and $v \in A_\ell^+$. Components $t_\alpha$ and $t_\beta$ ensure
that

$\alpha_{v[n]} \alpha_{v[n-1]} \cdots \alpha_{v[1]} = u = \beta_{v[n]} \beta_{v[n-1]} \cdots \beta_{v[1]}$

is the solution word that corresponds to the integer sequence $v[n], v[n-1], \ldots, v[1]$ if
$(w, \theta(w))$ is not accepted by $t_{\mathcal{A}}$. Therefore, every word pair $(w, \theta(w))$ which is not
accepted by $t_{\mathcal{A}}$ yields a solution for $\mathcal{A}$ and, vice versa, every solution for $\mathcal{A}$ yields a
word pair $(w, \theta(w))$ that cannot be accepted by $t_{\mathcal{A}}$. We conclude that $t_{\mathcal{A}}$ is
$\theta$-input-preserving if and only if the PCP instance $\mathcal{A}$ has no solution. This implies that for
fixed antimorphic $\theta$ over $A_k^*$ with $k \ge 2$ it is undecidable whether or not a given transducer
is $\theta$-input-preserving because the PCP is undecidable.

Now, let us describe the transducer component $t_R$ and recall that it has to work over
the alphabet $A_k$. It is well known that for any two regular languages $R_1$ and $R_2$ there
effectively exists a transducer which accepts the relation $R_1 \times R_2$. There is $t_R$ such that
$t_R = (A_k^* \setminus h(\Sigma^+ A_\ell^+)) \times A_k^*$. It is easy to observe that we have $t_R(w) = A_k^*$ if
$w \notin h(\Sigma^+ A_\ell^+)$, and $t_R(w) = \emptyset$ if $w \in h(\Sigma^+ A_\ell^+)$. Therefore, we have $\theta(w) \in t_R(w)$ if and
only if $w \notin h(\Sigma^+ A_\ell^+)$. Note that this in particular implies that, if $\theta(w) \notin t_R(w)$, then
$w \in h(\Gamma^*) \subseteq (A_2^m)^*$. The other two transducer components $t_\alpha$ and $t_\beta$ will only work over
word pairs from $h(\Gamma^*) \times \theta(h(\Gamma^*))$.

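[Figure 5: state diagram of $t_z$ with states $s_z$ and $f_z$ (final); the drawing is not
recoverable from the extraction. Edge labels appearing in the figure:
$\forall a \in \Gamma \colon (h(a), \varepsilon)$; $\forall i \in A_\ell \colon (h(z_i), \theta(h(i)))$; $\forall a \in \Gamma \colon (\varepsilon, \theta(h(a)))$;
$\forall i, j \in A_\ell \colon (h(i), \theta(h(j)))$; $\forall a, b \in \Sigma \colon (h(a), \theta(h(b)))$;
$\forall i \in A_\ell, z' \in \Sigma^{\le |z_i|} \setminus \mathrm{Pref}(z_i) \colon (h(z'), \theta(h(i)))$.]

Figure 5: For $z \in \{\alpha, \beta\}$ the two transducers $t_\alpha$ and $t_\beta$ enforce that $w$ encodes a solution
of the PCP instance $\mathcal{A}$ if $\theta(w) \notin (t_\alpha \cup t_\beta)(w)$ and $w \in h(\Sigma^+ A_\ell^+)$.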

Finally, we define the two transducers $t_\alpha$ and $t_\beta$ which are based on the words $\alpha_i$ and
$\beta_i$, respectively. For $z \in \{\alpha, \beta\}$ we define $t_z$ as shown in Fig. 5. For a pair of words
$(x, y) \in t_z$, it is easy to see that $x \in h(\Gamma^*)$ and $y \in \theta(h(\Gamma^*))$. Furthermore, the edges from
the final state $f_z$ to itself ensure that if $(x, y) \in t_z$, then for all words $x' \in h(\Gamma^*)$ and
$y' \in \theta(h(\Gamma^*))$, we have $(xx', yy') \in t_z$ (we will not leave the final state anymore once it is
reached, unless the word pair is not defined over $h(\Gamma^*) \times \theta(h(\Gamma^*))$). There are three
possibilities to switch from state $s_z$ to the final state $f_z$:

1.) we read a word from $h(A_\ell)$ in the first component and a word from $\theta(h(A_\ell))$ in the
second component;

2.) we read a word from $h(\Sigma)$ in the first component and a word from $\theta(h(\Sigma))$ in the
second component; or

3.) we read the word $\theta(h(i))$ with $i \in A_\ell$ in the second component and in the first
component we read a word $h(z')$ such that $z'$ is not a prefix of $z_i$ and $z_i$ is not a prefix
of $z'$ because of the length restriction on $z'$.

For $x \in h(\Gamma^*)$ let $u$ denote the longest word in $\Sigma^*$ such that $h(u)$ is a prefix of $x$ (thus,
either $x = h(u)$ or $x = h(u i x')$ for an integer $i \in A_\ell$ and $x' \in \Gamma^*$); and for
$y \in \theta(h(\Gamma^*))$ let $v$ denote the longest word in $A_\ell^*$ such that $\theta(h(v))$ is a prefix of $y$ and
let $n = |v|$ (thus, either $y = \theta(h(v))$ or $y = \theta(h(y' a v)) = \theta(h(v)) \theta(h(a)) \theta(h(y'))$ for a
symbol $a \in \Sigma$ and $y' \in \Gamma^*$). Because $\theta(h(v[n])) \theta(h(v[n-1])) \cdots \theta(h(v[1]))$ is a prefix of
$y$ we obtain that the pair $(x, y)$ is accepted by $t_z$ if $u \ne z_{v[n]} z_{v[n-1]} \cdots z_{v[1]}$.
Conversely, if $u = z_{v[n]} z_{v[n-1]} \cdots z_{v[1]}$, then $(h(u), \theta(h(v)))$ labels a path from $s_z$ to
$s_z$; since there is no edge from $s_z$ which is labeled $(h(i), \varepsilon)$, $(\varepsilon, \theta(h(a)))$, or
$(h(i), \theta(h(a)))$ for $i \in A_\ell$ and $a \in \Sigma$, we obtain that $(x, y)$ cannot be accepted by $t_z$.

Suppose $\theta(w) \notin t_z(w)$ and $w = h(uv)$ for words $u \in \Sigma^+$ and $v \in A_\ell^+$. Following our
notation from the previous paragraph, $u$ is the longest word in $\Sigma^*$ such that $h(u)$ is a prefix
of $w$, and $v$ is the longest word in $A_\ell^*$ such that $\theta(h(v))$ is a prefix of $\theta(w)$. Therefore,
we obtain that $u = z_{v[n]} z_{v[n-1]} \cdots z_{v[1]}$. □

This leads to the undecidability of the maximality problem of a regular language $L$
with respect to a $\theta$-transducer property $\mathcal{S}_{\theta,t}$.

Corollary 26. For every fixed antimorphic permutation $\theta$ over $A_k^*$ with $k \ge 2$, it is
undecidable whether or not the empty language $\emptyset$ is maximal with respect to the property
$\mathcal{S}_{\theta,t}$, for a given transducer $t$.

Proof. Clearly, the empty language satisfies $\mathcal{S}_{\theta,t}$. For a word $w$, the language $\{w\}$
satisfies $\mathcal{S}_{\theta,t}$ if and only if $\theta(w) \notin t(w)$. Therefore, $\emptyset$ is maximal with property
$\mathcal{S}_{\theta,t}$ if and only if $t$ is $\theta$-input-preserving. Theorem 25 concludes the proof. □

6 Undecidability of the $\theta$-PCP and the $\theta$-input-altering
Transducer Problem

Analogous to the undecidable PCP (see Definition 24), we introduce the $\theta$ version of the
PCP and prove that it is undecidable as well; see Theorem 28. Further, we utilize the $\theta$
version of the PCP in order to show that it is undecidable whether or not a transducer is
$\theta$-input-altering; see Corollary 29.

Definition 27. For a fixed antimorphic permutation $\theta$ over $A_k^*$, we introduce the $\theta$-Post
correspondence problem ($\theta$-PCP): given words $\alpha_0, \alpha_1, \ldots, \alpha_{\ell-1} \in A_k^+$ and
$\beta_0, \beta_1, \ldots, \beta_{\ell-1} \in A_k^+$, decide whether or not there exists a non-empty sequence of
integers $i_1, \ldots, i_n \in A_\ell = \{0, 1, \ldots, \ell - 1\}$ such that

$\alpha_{i_1} \alpha_{i_2} \cdots \alpha_{i_n} = \theta(\beta_{i_1} \beta_{i_2} \cdots \beta_{i_n}).$
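
To make the definition concrete, the sketch below searches for a $\theta$-PCP solution among index sequences of bounded length. Since the problem is undecidable (Theorem 28), such a search is only a semi-decision aid: a returned sequence is a genuine solution, while `None` only means that no solution of length at most `max_len` exists. The names `make_theta` and `theta_pcp_bounded` are hypothetical, and $\theta$ is assumed to be given by a permutation of the letters.

```python
from itertools import product

def make_theta(perm):
    """Antimorphic extension of a letter permutation: theta(uv) = theta(v)theta(u)."""
    def theta(w):
        return "".join(perm[c] for c in reversed(w))
    return theta

def theta_pcp_bounded(alphas, betas, theta, max_len):
    """Return the first index sequence of length <= max_len solving the instance."""
    assert len(alphas) == len(betas)
    for n in range(1, max_len + 1):
        for seq in product(range(len(alphas)), repeat=n):
            left = "".join(alphas[i] for i in seq)
            right = theta("".join(betas[i] for i in seq))
            if left == right:
                return list(seq)
    return None

# The DNA involution as an example of a fixed antimorphic permutation.
dna_delta = make_theta({"A": "T", "T": "A", "C": "G", "G": "C"})
```

For instance, with $\alpha = (AC, G)$ and $\beta = (C, GT)$ the sequence $0, 1$ is a solution under the DNA involution, because $\delta(C \cdot GT) = ACG = AC \cdot G$; note how the antimorphism reverses the order in which the $\beta$ words contribute.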

Theorem 28. For every fixed antimorphic permutation $\theta$ over $A_k^*$ with $k \ge 2$ the $\theta$-PCP
is undecidable.

Proof. In order to prove that $\theta$-PCP is undecidable, we will state an effective reduction
of any PCP instance $\mathcal{A}$ over alphabet $A_2$ to a $\theta$-PCP instance $\Gamma$ over alphabet $A_k$ such
that $\mathcal{A}$ has a solution if and only if $\Gamma$ has a solution. Let $\alpha_0, \alpha_1, \ldots, \alpha_{\ell-1} \in A_2^+$ and
$\beta_0, \beta_1, \ldots, \beta_{\ell-1} \in A_2^+$ be an instance of the PCP which we call $\mathcal{A}$.

Note that $\theta$ and $\theta^{-1}$ are well-defined over $A_2^* \subseteq A_k^*$. We define two morphisms $g, h$ on
$A_2^*$ such that

$g(0) = 00, \quad g(1) = 01, \quad h(0) = 10, \quad h(1) = 11.$

Note that for each pair of letters $z \in A_2^2$ we have either $z \in h(A_2)$ or $z \in g(A_2)$. Moreover,
we let

$\gamma_j = g(\alpha_j), \quad \delta_j = \theta^{-1}(h(\beta_j^R)), \quad$ for $j = 0, 1, \ldots, \ell - 1$,

$\gamma_\ell = h(0), \quad \delta_\ell = \theta^{-1}(g(0)),$

$\gamma_{\ell+1} = h(1), \quad \delta_{\ell+1} = \theta^{-1}(g(1))$

be the $\theta$-PCP instance $\Gamma$.


Figure 6: Transforming the solution $i_1, i_2, \ldots, i_n$ of the PCP instance $\mathcal{A}$ into the solution
$i_1, i_2, \ldots, i_n, i'_m, \ldots, i'_1$ of the $\theta$-PCP instance $\Gamma$; all variables are defined in the text.

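First, let us show that if $\mathcal{A}$ has a solution then $\Gamma$ has a solution as well. Let
$i_1, i_2, \ldots, i_n \in A_\ell$ with $n \ge 1$ be a solution of the PCP instance $\mathcal{A}$ and let $w$ be the word
corresponding to this solution; i.e.,

$w = \alpha_{i_1} \alpha_{i_2} \cdots \alpha_{i_n} = \beta_{i_1} \beta_{i_2} \cdots \beta_{i_n}.$

Figure 6 illustrates the following construction. Let $m = |w|$. For $j = 1, \ldots, m$ we let $i'_j = \ell$
if $w[j] = 0$ and $i'_j = \ell + 1$ if $w[j] = 1$; these indices are chosen such that

$\gamma_{i'_m} \gamma_{i'_{m-1}} \cdots \gamma_{i'_1} = h(w^R),$

$\delta_{i'_m} \delta_{i'_{m-1}} \cdots \delta_{i'_1} = \theta^{-1}(g(w[m])) \, \theta^{-1}(g(w[m-1])) \cdots \theta^{-1}(g(w[1])) = \theta^{-1}(g(w)).$

The integer sequence $i_1, i_2, \ldots, i_n, i'_m, i'_{m-1}, \ldots, i'_1$ is a solution of the $\theta$-PCP
instance $\Gamma$ because

$\theta(\delta_{i_1} \cdots \delta_{i_n} \delta_{i'_m} \cdots \delta_{i'_1}) = \theta(\delta_{i'_m} \cdots \delta_{i'_1}) \cdot \theta(\delta_{i_1} \cdots \delta_{i_n})$

$= \theta(\theta^{-1}(g(w))) \cdot \theta(\theta^{-1}(h(\beta_{i_n}^R))) \cdots \theta(\theta^{-1}(h(\beta_{i_1}^R)))$

$= g(w) \cdot h(\beta_{i_n}^R) \cdots h(\beta_{i_1}^R)$

$= g(\alpha_{i_1}) \cdots g(\alpha_{i_n}) \cdot h(w^R)$

$= \gamma_{i_1} \cdots \gamma_{i_n} \cdot \gamma_{i'_m} \cdots \gamma_{i'_1}.$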
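
The forward direction of the reduction can be exercised on a small instance. The following sketch builds the $\theta$-PCP instance from a binary PCP instance as in the construction above and lifts a known PCP solution to the sequence $i_1, \ldots, i_n, i'_m, \ldots, i'_1$. It is a hedged illustration: the function names, the example PCP instance, and the choice of $\theta$ (the antimorphic extension of the letter swap $0 \leftrightarrow 1$ over $A_2$, so $k = 2$) are assumptions made for the example.

```python
def make_theta(perm):
    """Return (theta, theta_inv) for the antimorphic extension of a letter permutation."""
    inv = {v: k for k, v in perm.items()}
    theta = lambda w: "".join(perm[c] for c in reversed(w))
    theta_inv = lambda w: "".join(inv[c] for c in reversed(w))
    return theta, theta_inv

G = {"0": "00", "1": "01"}   # morphism g
H = {"0": "10", "1": "11"}   # morphism h

def g(w):
    return "".join(G[c] for c in w)

def h(w):
    return "".join(H[c] for c in w)

def reduce_pcp(alphas, betas, theta_inv):
    """Build (gammas, deltas) from a binary PCP instance with l word pairs."""
    gammas = [g(a) for a in alphas] + [h("0"), h("1")]
    deltas = [theta_inv(h(b[::-1])) for b in betas]
    deltas += [theta_inv(g("0")), theta_inv(g("1"))]
    return gammas, deltas

def lift_solution(seq, alphas):
    """Append i'_m, ..., i'_1, where i'_j = l + w[j] for the solution word w."""
    l = len(alphas)
    w = "".join(alphas[i] for i in seq)
    return list(seq) + [l + int(c) for c in reversed(w)]

def is_theta_solution(gammas, deltas, seq, theta):
    left = "".join(gammas[i] for i in seq)
    right = theta("".join(deltas[i] for i in seq))
    return len(seq) > 0 and left == right

# Example PCP instance over {0,1}: the sequence 0,1 gives 1.10 = 11.0 = 110.
ALPHAS, BETAS = ["1", "10"], ["11", "0"]
theta, theta_inv = make_theta({"0": "1", "1": "0"})
GAMMAS, DELTAS = reduce_pcp(ALPHAS, BETAS, theta_inv)
LIFTED = lift_solution([0, 1], ALPHAS)
```

Here `LIFTED` is `[0, 1, 2, 3, 3]`, and both sides of the $\theta$-PCP equation evaluate to `010100101111`, which is $g(110) \cdot h(011)$ as predicted by the derivation.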

Vice versa, let $i_1, i_2, \ldots, i_n \in A_{\ell+2}$ with $n \ge 1$ be a solution of the $\theta$-PCP instance $\Gamma$
and let $w$ be the word corresponding to this solution, that is,

$w = \gamma_{i_1} \gamma_{i_2} \cdots \gamma_{i_n} = \theta(\delta_{i_1} \delta_{i_2} \cdots \delta_{i_n}) = \theta(\delta_{i_n}) \cdots \theta(\delta_{i_2}) \theta(\delta_{i_1}).$

Recall that for every word $\gamma_{i_j}$ we have that either $\gamma_{i_j} \in g(A_2^+)$ (in case $i_j < \ell$) or
$\gamma_{i_j} \in h(A_2)$ (in case $i_j \ge \ell$). Since $g(A_2)$ and $h(A_2)$ contain mutually distinct two-letter
words, for every pair of letters $p = w_{[2r-1;2r]}$ with $r \in \mathbb{N}$: if $p \in g(A_2)$, then $p$ is covered by
a factor $\gamma_{i_j}$ with $i_j < \ell$; and if $p \in h(A_2)$, then $p$ equals a factor $\gamma_{i_j}$ with $i_j \ge \ell$.
Symmetrically, for $p = w_{[2r-1;2r]}$ with $r \in \mathbb{N}$: if $p \in h(A_2)$, then $p$ is covered by a factor
$\theta(\delta_{i_j})$ with $i_j < \ell$; and if $p \in g(A_2)$, then $p$ equals a factor $\theta(\delta_{i_j})$ with $i_j \ge \ell$.

Figure 7: Transforming the solution $i_1, i_2, \ldots, i_n$ of the $\theta$-PCP instance $\Gamma$ into the solution
$i_1, i_2, \ldots, i_{n'}$ of the PCP instance $\mathcal{A}$; all variables are defined in the text.

Consider the case where $i_1 < \ell$. Figure 7 illustrates the following construction. In
this case, $\gamma_{i_1} = g(\alpha_{i_1})$ is a prefix of $w$ and $\theta(\delta_{i_1}) = h(\beta_{i_1}^R)$ is a suffix of $w$; thus,
$w_{[1;2]} \in g(A_2)$ and $w_{[|w|-1;|w|]} \in h(A_2)$. Further, we obtain that $i_n \ge \ell$ because $\gamma_{i_n}$ has to
cover $w_{[|w|-1;|w|]} \in h(A_2)$. There exists an integer $n'$ with $1 \le n' < n$ such that
$i_1, i_2, \ldots, i_{n'} < \ell$ but $i_{n'+1} \ge \ell$. We will show that the sequence $i_1, i_2, \ldots, i_{n'}$ is a
solution of the PCP instance $\mathcal{A}$ by comparing the longest prefix of $w$ which belongs to
$g(A_2^*)$ with the longest suffix of $w$ which belongs to $h(A_2^*)$. Let $m$ be an even integer such
that $w_{[1;m]} \in g(A_2^*)$ but $w_{[m+1;m+2]} \in h(A_2)$. Because $\gamma_{i_{n'+1}}$ has to match with the first
letter pair in $w$ which belongs to $h(A_2)$, it is not difficult to see that

$w_{[1;m]} = \gamma_{i_1} \gamma_{i_2} \cdots \gamma_{i_{n'}} = g(\alpha_{i_1} \alpha_{i_2} \cdots \alpha_{i_{n'}}).$

Because $w_{[1;m]} \in g(A_2^*)$ and $w_{[m+1;m+2]} \in h(A_2)$, there exists an integer $j \le n$ such that
$i_j, i_{j+1}, \ldots, i_n \ge \ell$, $i_{j-1} < \ell$, and

$w_{[1;m]} = \theta(\delta_{i_j} \delta_{i_{j+1}} \cdots \delta_{i_n}) = \theta(\delta_{i_n}) \cdots \theta(\delta_{i_{j+1}}) \theta(\delta_{i_j}).$

Due to the design of the word pairs $(\gamma_\ell, \delta_\ell)$ and $(\gamma_{\ell+1}, \delta_{\ell+1})$ and because

$\theta(\delta_{i_n}) \cdots \theta(\delta_{i_{j+1}}) \theta(\delta_{i_j}) = g(\alpha_{i_1} \alpha_{i_2} \cdots \alpha_{i_{n'}})$

is a prefix of $w$, we have that $\gamma_{i_j} \gamma_{i_{j+1}} \cdots \gamma_{i_n} = h((\alpha_{i_1} \alpha_{i_2} \cdots \alpha_{i_{n'}})^R)$ is a suffix
of $w$. Since $i_{j-1} < \ell$, we see that this suffix $h((\alpha_{i_1} \alpha_{i_2} \cdots \alpha_{i_{n'}})^R)$ of $w$ is preceded by
a letter pair from $g(A_2)$. This implies that the suffix $\theta(\delta_{i_{n'}}) \cdots \theta(\delta_{i_2}) \theta(\delta_{i_1})$ of $w$
equals $h((\alpha_{i_1} \alpha_{i_2} \cdots \alpha_{i_{n'}})^R)$. Therefore,

$h((\alpha_{i_1} \alpha_{i_2} \cdots \alpha_{i_{n'}})^R) = \theta(\delta_{i_{n'}}) \cdots \theta(\delta_{i_2}) \theta(\delta_{i_1})$

$= h(\beta_{i_{n'}}^R) \cdots h(\beta_{i_2}^R) h(\beta_{i_1}^R)$

$= h((\beta_{i_1} \beta_{i_2} \cdots \beta_{i_{n'}})^R).$

We conclude that $\alpha_{i_1} \alpha_{i_2} \cdots \alpha_{i_{n'}} = \beta_{i_1} \beta_{i_2} \cdots \beta_{i_{n'}}$ and, therefore,
$i_1, i_2, \ldots, i_{n'}$ is a solution of the PCP instance $\mathcal{A}$.

The case when $i_1 \ge \ell$ can be treated analogously, where we compare the longest prefix
of $w$ which belongs to $h(A_2^*)$ and the longest suffix of $w$ which belongs to $g(A_2^*)$. In this
case, there exists $n' \le n$ such that $i_{n'}, i_{n'+1}, \ldots, i_n$ is a solution of the PCP instance
$\mathcal{A}$. □

We can utilize the $\theta$-PCP in order to prove that it is undecidable whether or not a
transducer is $\theta$-input-altering, even for one-state transducers.

Corollary 29. For every fixed antimorphic permutation $\theta$ over $A_k^*$ with $k \ge 2$ it is
undecidable whether or not a given (one-state) transducer is $\theta$-input-altering.

Proof. Let $\alpha_0, \alpha_1, \ldots, \alpha_{\ell-1}$ and $\beta_0, \beta_1, \ldots, \beta_{\ell-1}$ be words over $A_k$ forming the $\theta$-PCP instance $A$. We
let $t_A$ be the one-state transducer shown in Fig. 8. Clearly, we have $y \in t_A(x)$ if and
only if there exists an integer sequence $i_1, i_2, \ldots, i_n \in A_\ell$ such that $x = \alpha_{i_1} \alpha_{i_2} \cdots \alpha_{i_n}$ and
$y = \theta^2(\beta_{i_1}) \theta^2(\beta_{i_2}) \cdots \theta^2(\beta_{i_n}) = \theta^2(\beta_{i_1} \beta_{i_2} \cdots \beta_{i_n})$; note that $\theta^2$ is always morphic, even if $\theta$
is not.

Recall that it is allowed for $\theta$-input-altering transducers to accept the empty word pair
$(\varepsilon, \varepsilon)$. We have $w \in \theta^{-1}(t_A(w))$ for some word $w \in A_k^+$ if and only if there exists an integer
sequence $i_1, i_2, \ldots, i_n$ such that

\[ \alpha_{i_1} \alpha_{i_2} \cdots \alpha_{i_n} = w = \theta^{-1}(\theta^2(\beta_{i_1} \beta_{i_2} \cdots \beta_{i_n})) = \theta(\beta_{i_1} \beta_{i_2} \cdots \beta_{i_n}). \]

Therefore, $t_A$ is $\theta$-input-altering if and only if the $\theta$-PCP instance $A$ has no solution.
Theorem 28 concludes the proof. □
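The equation in the proof above can be explored on small instances. The following is only a minimal sketch: the helper names `make_antimorphism` and `theta_pcp_solution` and the bound `max_len` are illustrative assumptions, not part of the paper. Since the problem is undecidable (Theorem 28), a bounded search can find a solution but can never certify that none exists.

```python
from itertools import product


def make_antimorphism(perm):
    """Extend a letter permutation `perm` (a dict) to an antimorphism:
    theta(xy) = theta(y) theta(x)."""
    def theta(word):
        return "".join(perm[c] for c in reversed(word))
    return theta


def theta_pcp_solution(alphas, betas, theta, max_len=6):
    """Search index sequences of length 1..max_len (hypothetical bound) for
    alpha_{i_1}...alpha_{i_n} == theta(beta_{i_1}...beta_{i_n}).

    Returns the first sequence found, or None if there is none within the bound.
    """
    for n in range(1, max_len + 1):
        for seq in product(range(len(alphas)), repeat=n):
            left = "".join(alphas[i] for i in seq)
            right = theta("".join(betas[i] for i in seq))
            if left == right:
                return seq
    return None


# Mirror image over {0, 1}: the antimorphic extension of the identity permutation.
mirror = make_antimorphism({"0": "0", "1": "1"})
solution = theta_pcp_solution(["01", "1"], ["1", "10"], mirror)
```

Here `solution` is the index sequence `(0, 1)`, because `"01" + "1"` equals the mirror image of `"1" + "10"`. In the terms of Corollary 29, the word `"011"` then witnesses that the corresponding one-state transducer relates some nonempty word to its own image under $\theta$.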


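Figure 8: The one-state transducer $t_A$, with edge labels $(\alpha_i, \theta^2(\beta_i))$ for all $i \in A_\ell$, encodes the $\theta$-PCP instance $\alpha_0, \alpha_1, \ldots, \alpha_{\ell-1}, \beta_0, \beta_1, \ldots, \beta_{\ell-1}$.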

7 A Hierarchy of DNA-related $\theta$-transducer Properties

In [13,18,19] the authors consider numerous properties of languages inspired by reliability
issues in DNA computing. Let $\theta$ be defined over $A^*$ and assume that $\theta^2 = \mathrm{id}$ since in
the DNA setting $\theta = \delta$ is an involution. The relationships between some of the defined
3-independent DNA-related properties are displayed in Fig. 9. All properties have in common
that they forbid certain “constellations” of words. Consider a language $L \subseteq A^+$ and two
words $uwv, \theta(xwy) \in A^+$ with $w \neq \varepsilon$ as shown in the top property in Fig. 9. The same
notation can be employed for all properties in the figure, where some properties require
that $x$, $y$, $u$, or $v$ are empty, e.g., for $x = y = \varepsilon$ we obtain the $\theta$-compliant property. In
the case of $\theta$-nonoverlapping all of $x, y, u, v$ are empty and

(A) a language $L$ is $\theta$-nonoverlapping if for all $w \in A^+$, we have $w \notin L$ or $\theta(w) \notin L$.
This is equivalent to requiring that $L \cap \theta(L) = \emptyset$.

For all properties, except $\theta$-nonoverlapping, the language $L$ has property $P$ if $uwv \in L$
and $\theta(xwy) \in L$ implies that $uvxy = \varepsilon$. For example,

(B) a language $L$ is $\theta$-compliant if for all $w \in A^+$ and $u, v \in A^*$, we have
$uwv, \theta(w) \in L \implies uv = \varepsilon$; and

(C) a language $L$ is $\theta$-5'-overhang-free if for all $w \in A^+$ and $u, y \in A^*$, we have
$uw, \theta(wy) \in L \implies uy = \varepsilon$.

Previous papers considered the strict version only for some of the properties. Here, we
generalize the concept of strict properties such that if $uwv \in L$ and $\theta(xwy) \in L$, then $L$
does not satisfy the strict property $P_s$ (even if $uvxy = \varepsilon$). For example,

(D) a language $L$ is strictly $\theta$-compliant if for all $w \in A^+$ and for all $u, v \in A^*$, we have
$uwv \notin L$ or $\theta(w) \notin L$; and

(E) a language $L$ is strictly $\theta$-5'-overhang-free if for all $w \in A^+$ and $u, y \in A^*$, we have
$uw \notin L$ or $\theta(wy) \notin L$.

Note that $\theta$-nonoverlapping is actually a strict property while its “normal version”
would be the property that is trivially satisfied by every language in $A^+$.

Figure 9: Correlation of various 3-independent DNA language properties. For each property
the forbidden constellation of words (or single strands) is depicted. Words are represented
as arrows such that the first letter (the 5'-end) is the blunt end of the arrow and the
last letter (the 3'-end) is the arrow tip. Red, vertical lines represent bonding between
$\theta$-complementary parts of the two words.

Furthermore, we introduce the weak version of a property which follows the concept of
classic code properties like the (weakly) overlap-free property where it is allowed for a word
to overlap with itself, but not with another word: for a language $L$ which satisfies the weak
property $P_w$, if the words $uwv$ and $\theta(xwy)$ belong to $L$, then $uvxy = \varepsilon$ or $uwv = \theta(xwy)$.
For example,

(F) a language $L$ is weakly $\theta$-5'-overhang-free if for all $w \in A^+$ and $u, y \in A^*$, we have
$uw, \theta(wy) \in L$ implies $uy = \varepsilon$ or $uw = \theta(wy)$.

Note that for some properties, like $\theta$-compliant, the weak property $P_w$ coincides with
the (normal) property $P$.
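The three versions of a property can be checked directly on a finite language by enumerating constellations. The following is a minimal sketch, not the transducer-based algorithm of this paper: the function names, the `free` parameter (naming which of $u, v, x, y$ may be nonempty) and the choice of the DNA involution with A↔T and C↔G as default are all illustrative assumptions, and the code relies on $\theta^2 = \mathrm{id}$.

```python
def dna_involution(word):
    """Reverse the word and exchange A<->T, C<->G (assumed DNA involution)."""
    complement = {"A": "T", "T": "A", "C": "G", "G": "C"}
    return "".join(complement[c] for c in reversed(word))


def constellations(first, second, free):
    """Yield (u, w, v, x, y) with first == u+w+v, second == x+w+y, w nonempty.

    Only the parts whose names occur in `free` (a subset of "uvxy") may be nonempty.
    """
    for i in range(len(first)):
        for j in range(i + 1, len(first) + 1):
            u, w, v = first[:i], first[i:j], first[j:]
            if (u and "u" not in free) or (v and "v" not in free):
                continue
            for k in range(len(second) - len(w) + 1):
                if second[k:k + len(w)] != w:
                    continue
                x, y = second[:k], second[k + len(w):]
                if (x and "x" not in free) or (y and "y" not in free):
                    continue
                yield u, w, v, x, y


def satisfies(language, free, version="normal", theta=dna_involution):
    """Check a finite language against the normal, strict or weak version."""
    for first in language:               # first plays the role of uwv
        for other in language:           # other plays the role of theta(xwy)
            second = theta(other)        # recovers xwy because theta^2 = id
            for u, w, v, x, y in constellations(first, second, free):
                if version == "strict":
                    return False         # any constellation is forbidden
                if u + v + x + y == "":
                    continue             # tolerated by normal and weak versions
                if version == "weak" and first == other:
                    continue             # a word may interact with itself
                return False
    return True
```

For instance, `satisfies({"ACGA", "CG"}, "uv")` tests $\theta$-compliance and returns `False`, since `CG` is its own image under the involution and occurs inside `ACGA` with nonempty context. Passing `free=""` together with `version="strict"` tests $\theta$-nonoverlapping.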

If a language $L$ satisfies the strict property $P_s$, then it also satisfies the corresponding
(normal) property $P$; and if $L$ satisfies the (normal) property $P$, then it also satisfies
the corresponding weak property $P_w$. Furthermore, there is a normal, strict, and weak
hierarchy of properties which is shown in Fig. 9, where $\theta$-nonoverlapping only exists in the
strict hierarchy. For all three hierarchies an arrow $P_x \to Q_x$ (for $x \in \{\varepsilon, s, w\}$) between
two properties $P_x$ and $Q_x$ means that if a language $L$ satisfies property $P_x$, then it also
satisfies property $Q_x$.

Let us discuss how these properties can be described as $\theta$-transducer properties. The
type of the property (W-property or S-property) and the type of the transducer
(unrestricted, $\theta$-input-altering, $\theta$-input-preserving) are important when it comes to the complexity
of the satisfaction problem and the decidability of the maximality problem; see Table 1.
Firstly, observe that $L$ is $\theta$-nonoverlapping if $L$ satisfies the $\theta$-transducer property $S_{\theta,t_{\mathrm{id}}}$
where $t_{\mathrm{id}}$ is a transducer realizing the identity relation. Since any strict property,
including $\theta$-nonoverlapping, is not satisfied by a singleton language $\{w\}$ that consists of
one $\theta$-palindrome $w = \theta(w)$, strict properties cannot be described as S-properties by a
$\theta$-input-altering transducer or as W-properties, according to Remark 7.
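To see why a singleton $\theta$-palindrome violates every strict property, it may help to instantiate the forbidden constellation with empty contexts. The word ACGT is only an illustrative choice, and we assume here that $\theta$ is the DNA involution, written $\delta$, which reverses a word and exchanges A↔T and C↔G:

```latex
\begin{align*}
\delta(\mathtt{ACGT}) &= \delta(\mathtt{T})\,\delta(\mathtt{G})\,\delta(\mathtt{C})\,\delta(\mathtt{A}) = \mathtt{ACGT}, \\
u = v = x = y = \varepsilon,\ w = \mathtt{ACGT}: \quad
uwv &= w \in L \quad\text{and}\quad \delta(xwy) = \delta(w) = w \in L .
\end{align*}
```

So $L = \{\mathtt{ACGT}\}$ contains a constellation with $uvxy = \varepsilon$, which a strict property forbids although the corresponding normal and weak properties tolerate it.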

Figure 10 shows two families of transducers which are capable of describing any of the
DNA-related properties that we introduced in this section. Depending on whether or not $u$
(resp., $v$, $x$, $y$) is empty one has to omit a set of edges in each transducer. The S-properties
$S_{\theta,t_s}$ describe the strict properties, the S-properties $S_{\theta,t_w}$ describe the normal properties,
and the W-properties $W_{\theta,t_w}$ describe the weak properties. If we omit red and orange
edges (i.e., $xy = \varepsilon$), then $t_w$ is $\theta$-input-altering because the input word is strictly longer
than the output word. Therefore, $S_{\theta,t_w} = W_{\theta,t_w}$, i.e., the normal property coincides with
the corresponding weak property. The case when all blue and green edges are omitted
is symmetric when input and output swap roles. We demonstrate this construction in
Examples 30 and 31.


Figure 10: The family of transducers which describes all properties shown in Fig. 9. Each
of the two transducer families describes 16 different transducers: We can either omit or
include each of the red, orange, blue and green edges. These edges are omitted depending
on the property that is described, for example, omit all red edges if $x = \varepsilon$ in Fig. 9.


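Example 30. Let $t_s^{\mathrm{c}}$ and $t_w^{\mathrm{c}}$ be the two transducers that are obtained by omitting all red
and orange edges in $t_s$ and $t_w$ (Fig. 10), respectively. Then $S_{\theta,t_s^{\mathrm{c}}}$ is the strict $\theta$-compliant
property, whereas $S_{\theta,t_w^{\mathrm{c}}}$ is the (normal) $\theta$-compliant property. Since $t_w^{\mathrm{c}}$ is $\theta$-input-altering,
$S_{\theta,t_w^{\mathrm{c}}}$ is equal to $W_{\theta,t_w^{\mathrm{c}}}$ and the properties $\theta$-compliant and weak $\theta$-compliant coincide.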




















Example 31. Let $t_s^{\mathrm{5OF}}$ and $t_w^{\mathrm{5OF}}$ be the two transducers that are obtained by omitting
all red and green edges in $t_s$ and $t_w$ (Fig. 10), respectively. Then $S_{\theta,t_s^{\mathrm{5OF}}}$ is the strict
$\theta$-5'-overhang-free property, $S_{\theta,t_w^{\mathrm{5OF}}}$ is the (normal) $\theta$-5'-overhang-free property, and $W_{\theta,t_w^{\mathrm{5OF}}}$ is
the weak $\theta$-5'-overhang-free property.

Observe that the word $z = \mathtt{AACG}$ can have a $\theta$-5'-overhang with itself (as $u = \mathtt{AA}$,
$w = \theta(w) = \mathtt{CG}$, and $y = \mathtt{TT}$). As expected, $t_w^{\mathrm{5OF}}$ does accept the word pair $(\mathtt{AACG}, \mathtt{CGTT})$
and, therefore, the singleton language $\{z\}$ does not satisfy the (normal) $\theta$-5'-overhang-free
property $S_{\theta,t_w^{\mathrm{5OF}}}$; however, $\{z\}$ does satisfy the weak $\theta$-5'-overhang-free property $W_{\theta,t_w^{\mathrm{5OF}}}$.
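The overhang in this example can be verified letter by letter. The computation below is a sketch under the assumption that $\theta$ is the DNA involution, written $\delta$, which reverses a word and exchanges A↔T and C↔G; in the notation of definition (C) it uses $u = \mathtt{AA}$, $w = \mathtt{CG}$ and $y = \mathtt{TT}$:

```latex
\begin{align*}
\delta(wy) = \delta(\mathtt{CGTT})
  &= \delta(\mathtt{T})\,\delta(\mathtt{T})\,\delta(\mathtt{G})\,\delta(\mathtt{C})
   = \mathtt{AACG} = uw = z, \\
uy &= \mathtt{AATT} \neq \varepsilon .
\end{align*}
```

Both $uw$ and $\delta(wy)$ therefore lie in $\{z\}$ while $uy \neq \varepsilon$, which violates the normal property; the weak condition is met because $uw = \delta(wy)$.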

Lastly, note that the (strict, weak) $\theta$-overhang-free property is different from the other
properties in Fig. 9 insofar as it forbids two word constellations: $\theta$-5'-overhangs and
$\theta$-3'-overhangs. This property can be described by a transducer which contains two components,
where one component covers the $\theta$-5'-overhangs and the other component covers the
$\theta$-3'-overhangs.

8 Conclusions 

We have defined a transducer-based method for describing DNA code properties which is 
strictly more expressive than the trajectory method. In doing so, the satisfaction question 
remains efficiently decidable. The maximality question for some types of properties is 
decidable, but it is undecidable for others. While some versions of the maximality question 
for trajectory properties are decidable, the case of any given pair of regular trajectories and 
any given regular language is not addressed in [6], so we consider this to be an interesting 
problem to solve. 

The maximality questions are phrased in terms of any fixed antimorphic permutation.
This direction of generalizing decision questions is also applied to the classic Post
Correspondence Problem, where we demonstrate that it remains undecidable. A consequence
of this is that the question of whether a given transducer is $\theta$-input-altering is also
undecidable. It is interesting to note that if, instead of fixing $\theta$, we fix the transducer $t$ to be
the identity, or the transducer defining the S-property shown in Fig. 3 (Sect. 4), then the
question of whether or not

\[ \theta(L) \cap t(L) = \emptyset \]

is decidable (given any regular language $L$ and antimorphic permutation $\theta$).

The topic of studying description methods for code properties requires further attention.
One important aim is the actual implementation of the algorithms, as has already been done
for several classic code properties [9,24]. An immediate plan is to incorporate into those
implementations what we know about DNA code properties. Another aim is to increase
the expressive power of our description methods. The formal method of [16] is quite
expressive, using a certain type of first-order formulae to describe properties. It could
perhaps be further worked out in a way that some of these formulae can be mapped to
transducers. We also note that if the defining method is too expressive then even the
satisfaction problem could become undecidable; see for example the method of multiple
sets of trajectories in [7].